Databricks ML Examples

Introduced in 2023, the Databricks ML Examples repository collects example notebooks and scripts that demonstrate state-of-the-art Machine Learning and Large Language Models (LLMs) on the Databricks platform. Its directories include `llm-models/` for per-model LLM examples and `llm-fine-tuning/` for fine-tuning scripts.

https://github.com/databricks/databricks-ml-examples

LLM Foundry

Released in 2023, the LLM Foundry repository by MosaicML, now part of Databricks, contains code for training, fine-tuning, evaluating, and deploying LLMs using Composer and the MosaicML platform. Designed for ease of use, efficiency, and flexibility, it enables rapid experimentation with the latest techniques in LLM development, facilitating scalable and efficient model training.

https://github.com/mosaicml/llm-foundry

Databricks LLM Prompt Engineering

In 2023, the Databricks LLM Prompt Engineering repository was introduced to cover various use cases related to prompt engineering and LLMs. It includes notebooks for experimenting with different prompt engineering techniques and showcases LLM deployment using Databricks Model Serving with GPU support, aiding in the development of effective prompt strategies for LLM applications.

https://github.com/Databricks-NEMEA-Specialists/ml-llm-prompt-engineering

Databricks LLM Foundation Models

The Databricks LLM Foundation Models repository, launched in 2023, provides courseware tested on Databricks Runtime 13.3 LTS for Machine Learning. It offers resources for understanding and utilizing foundation models within the Databricks platform, serving as a valuable educational tool for practitioners aiming to implement LLM solutions.

https://github.com/alikhawaja/databricks-llm-foundation-models

Databricks Academy Large Language Models

Introduced in 2023, the Databricks Academy Large Language Models repository offers courseware focused on LLMs, tested on Databricks Runtime 13.3 LTS for Machine Learning. It provides instructional materials and notebooks designed to educate users on LLM concepts and their practical applications within the Databricks ecosystem.

https://github.com/databricks-academy/large-language-models

Databricks LLM Models

Databricks LLM Models, established in 2023, is the `llm-models/` directory of the Databricks ML Examples repository. It contains example notebooks for using various LLMs on the Databricks platform, with subdirectories for models like Falcon, Llama 2, and Mistral that cover model loading, inference, and fine-tuning.

https://github.com/databricks/databricks-ml-examples/tree/master/llm-models

Databricks DIY LLM QA Bot

Released in 2023, the Databricks DIY LLM QA Bot repository provides a solution accelerator for building a question-answering bot using LLMs on the Databricks platform. It includes resources and instructions to help users develop and deploy their own LLM-powered QA bots, facilitating the creation of intelligent conversational agents.

https://github.com/databricks-industry-solutions/diy-llm-qa-bot

DBRX

In March 2024, Databricks released DBRX, an open Large Language Model developed by the MosaicML team at Databricks. DBRX uses a mixture-of-experts architecture with 132 billion total parameters, of which 36 billion are active for any given token. It serves as a foundation model for various applications, and Databricks reported that it outperformed other open models such as Llama 2 on several benchmarks.

https://github.com/databricks/dbrx
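The gap between 132 billion total and 36 billion active parameters comes from mixture-of-experts routing: a small router scores the experts for each token and only the top-scoring few are run. The following is a minimal pure-Python sketch of top-k routing, not code from the DBRX repository. The expert count (16) and the number selected per token (4) follow Databricks' DBRX announcement; the toy experts, the random router scores, and the choice to renormalize the selected weights are assumptions made for illustration.

```python
import math
import random

NUM_EXPERTS = 16  # per the DBRX announcement: 16 experts per MoE layer
TOP_K = 4         # ...of which 4 are selected for each token


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, k=TOP_K):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Renormalizing the selected weights so they sum to 1 is an assumption
    of this sketch, not a statement about DBRX's exact router.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_layer(x, experts, router_logits):
    """Weighted sum over the selected experts only; the rest are never run."""
    return sum(w * experts[i](x) for i, w in route(router_logits))


# Hypothetical toy experts: expert s simply multiplies its input by s.
experts = [lambda x, s=s: s * x for s in range(1, NUM_EXPERTS + 1)]

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # made-up router scores

selected = route(logits)
output = moe_layer(2.0, experts, logits)

# Share of parameters touched per token, from the figures quoted above.
active_fraction = 36 / 132
```

The active share (about 27%) is slightly higher than 4/16 because parameters outside the expert blocks, such as attention and embeddings, are presumably used for every token regardless of routing.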

Databricks MLflow

Introduced in 2018, MLflow is an open-source platform for managing the end-to-end Machine Learning lifecycle. Developed by Databricks, it provides tools for experiment tracking, model deployment, and a centralized model registry, facilitating reproducible and efficient ML workflows.

https://github.com/mlflow/mlflow
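To make the idea of experiment tracking concrete, here is a small conceptual sketch in plain Python. It is deliberately not the MLflow API: `ToyTracker`, `Run`, and `best_run` are hypothetical names, and the learning rates and accuracies are made-up values. It only illustrates the underlying pattern of recording parameters and metrics per run and then comparing runs. In MLflow itself, the corresponding calls are `mlflow.start_run`, `mlflow.log_param`, and `mlflow.log_metric`.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Run:
    """One training attempt: its inputs (params) and its results (metrics)."""
    run_id: str
    params: Dict[str, str] = field(default_factory=dict)
    metrics: Dict[str, float] = field(default_factory=dict)


class ToyTracker:
    """Hypothetical in-memory stand-in for a tracking server."""

    def __init__(self) -> None:
        self.runs: List[Run] = []

    def start_run(self) -> Run:
        run = Run(run_id=uuid.uuid4().hex)
        self.runs.append(run)
        return run

    def best_run(self, metric: str, higher_is_better: bool = True) -> Run:
        """Return the run with the best value for the given metric."""
        scored = [r for r in self.runs if metric in r.metrics]
        pick = max if higher_is_better else min
        return pick(scored, key=lambda r: r.metrics[metric])


tracker = ToyTracker()

# Made-up results for three hypothetical training runs.
for lr, acc in [(0.1, 0.81), (0.01, 0.88), (0.001, 0.84)]:
    run = tracker.start_run()
    run.params["learning_rate"] = str(lr)
    run.metrics["val_accuracy"] = acc

best = tracker.best_run("val_accuracy")
```

Because every run keeps its parameters next to its metrics, the best configuration can be recovered later without rerunning anything, which is the reproducibility benefit the description refers to.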

Databricks Delta Lake

Launched in 2019, Delta Lake is an open-source storage layer that brings reliability to data lakes. Developed by Databricks, it provides ACID transactions, scalable metadata handling, and unified streaming and batch data processing.

https://github.com/delta-io/delta
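One way to picture how ACID transactions can work on top of plain files is an ordered commit log in which each version number can be written exactly once. The sketch below is a simplified, in-memory illustration of that optimistic-concurrency idea, not Delta Lake's implementation: `ToyLog`, `CommitConflict`, and the file names are hypothetical, and real Delta Lake applies finer-grained conflict checks before deciding whether a commit must be retried.

```python
class CommitConflict(Exception):
    """Raised when another writer already claimed the target version."""


class ToyLog:
    """Hypothetical ordered commit log; each version is written at most once."""

    def __init__(self):
        self._commits = {}  # version number -> list of (op, path) actions

    def latest_version(self):
        return max(self._commits, default=-1)

    def commit(self, read_version, actions):
        """Atomically publish actions as the version after read_version."""
        version = read_version + 1
        if version in self._commits:
            raise CommitConflict(f"version {version} already exists")
        self._commits[version] = list(actions)
        return version

    def snapshot(self, version=None):
        """Replay the log up to version and return the set of live files."""
        if version is None:
            version = self.latest_version()
        files = set()
        for v in sorted(self._commits):
            if v > version:
                break
            for op, path in self._commits[v]:
                if op == "add":
                    files.add(path)
                elif op == "remove":
                    files.discard(path)
        return files


log = ToyLog()
log.commit(log.latest_version(), [("add", "part-0.parquet")])  # becomes version 0

# Two writers read the same table version before either one commits.
seen_by_a = log.latest_version()
seen_by_b = log.latest_version()

log.commit(seen_by_a, [("add", "part-1.parquet")])  # writer A wins version 1

try:
    log.commit(seen_by_b, [("add", "part-2.parquet")])  # writer B is now stale
    conflict = False
except CommitConflict:
    conflict = True
    # Writer B re-reads the latest version and retries on top of it.
    log.commit(log.latest_version(), [("add", "part-2.parquet")])
```

Readers only ever see the files listed by fully published versions, so a half-finished write is invisible, and replaying the log up to an older version number yields an earlier snapshot of the table.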


Databricks MosaicML

In 2023, Databricks acquired MosaicML, a company specializing in machine learning and large language models (LLMs). This acquisition enhanced Databricks' capabilities in AI model training and deployment, integrating MosaicML's expertise into the Databricks platform to provide more robust ML solutions.

https://github.com/mosaicml

Databricks Dolly

Introduced in 2023, Databricks Dolly is an open-source instruction-following LLM developed by Databricks. Organizations can use it as a starting point for building and customizing their own AI models on proprietary data for specific use cases.

https://github.com/databrickslabs/dolly

Databricks Koalas

Introduced in 2019, Koalas is an open-source project that brings a pandas DataFrame API to Apache Spark. Developed by Databricks, it aims to make data scientists more productive when interacting with big data, providing a familiar interface for pandas users to leverage the scalability of Spark. Its functionality was later merged into Apache Spark itself and has shipped since Spark 3.2 as the pandas API on Spark (`pyspark.pandas`).

https://github.com/databricks/koalas

Databricks AutoML

Launched in 2021, Databricks AutoML is a tool that automates the process of machine learning model development. It assists users in quickly generating baseline models, performing hyperparameter tuning, and providing insights into model performance, streamlining the ML workflow.

https://github.com/databricks/databricks-automl

Databricks Feature Store

Released in 2021, the Databricks Feature Store is a centralized repository for sharing and managing ML features. It enables feature discovery, versioning, and reuse across different models and teams, promoting collaboration and consistency in ML projects.

https://github.com/databricks/feature-store

Databricks Repos

Introduced in 2021, Databricks Repos provides repository integration for Databricks notebooks and workflows. It allows users to synchronize their work with version control systems like Git, facilitating collaborative development and versioning of ML code and assets.

https://github.com/databricks/databricks-repos

Databricks Unity Catalog

Launched in 2022, Databricks Unity Catalog is a unified governance solution for data and AI assets. It provides fine-grained access controls, centralized metadata management, and audit capabilities, supporting data security and compliance across the Databricks platform.

https://github.com/databricks/unity-catalog

Databricks Photon

Released in 2022, Databricks Photon is a native vectorized query engine designed to accelerate query performance. It leverages modern CPU architectures to provide faster data processing, enhancing the efficiency of ML and AI workloads on the Databricks platform.

https://github.com/databricks/photon