spark-nlp and nlu
These two tools are siblings within the John Snow Labs NLP ecosystem: NLU provides a high-level, simplified API for accessing the extensive NLP models and functionality implemented by Spark NLP.
About spark-nlp
JohnSnowLabs/spark-nlp
State of the Art Natural Language Processing
Builds on Apache Spark for distributed NLP at scale, with 100,000+ pretrained pipelines and models across 200+ languages. Runs transformer architectures (BERT, RoBERTa, GPT-2, Llama, etc.) natively in the JVM ecosystem (Java, Scala, Kotlin) and imports models from TensorFlow, ONNX, OpenVINO, and GGUF formats. Covers end-to-end tasks including tokenization, embeddings, NER, machine translation, question answering, image captioning, and speech recognition.
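As a rough illustration of the pretrained-pipeline workflow described above, the sketch below uses Spark NLP's Python bindings. The helper name annotate_with_spark_nlp is hypothetical, and the pipeline name "explain_document_dl" is an assumption chosen for illustration; running it requires pyspark, a Java runtime, and network access to download the pipeline.

```python
def annotate_with_spark_nlp(text, pipeline_name="explain_document_dl"):
    """Annotate one string with a pretrained Spark NLP pipeline.

    Hypothetical helper: the default pipeline name is an assumption,
    not something the page above specifies.
    """
    # Imported lazily so this module loads even where Spark NLP is not installed.
    import sparknlp
    from sparknlp.pretrained import PretrainedPipeline

    # Start (or reuse) a Spark session configured for Spark NLP.
    sparknlp.start()

    # Download the pretrained pipeline on first use, then load it from cache.
    pipeline = PretrainedPipeline(pipeline_name, lang="en")

    # annotate() returns a dict mapping annotator output names to lists of strings.
    return pipeline.annotate(text)
```

Calling annotate_with_spark_nlp("John Snow Labs builds Spark NLP.") would return one dictionary for the sentence; which keys it contains depends on the pipeline that is loaded.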
About nlu
JohnSnowLabs/nlu
1 line for thousands of State of The Art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.
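The "1 line" claim refers to NLU's load-and-predict call. A minimal sketch, assuming the nlu package is installed alongside pyspark and a Java runtime; the helper name predict_sentiment is hypothetical, and the "sentiment" model reference is an assumption used for illustration.

```python
def predict_sentiment(texts, model_ref="sentiment"):
    """Run a pretrained model through NLU's load-and-predict API.

    Hypothetical helper: the default model reference is an assumption
    chosen for illustration.
    """
    # Imported lazily so this module loads even where nlu is not installed.
    import nlu

    # The "one line": resolve the model reference, then predict on a string
    # or a list of strings.
    return nlu.load(model_ref).predict(texts)
```

With the real library, predict returns a pandas DataFrame by default, with one or more rows per input text. The first call downloads the model, so it needs network access.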