ACID
ACID is the set of properties a database enforces for its transactions: atomicity, consistency, isolation, and durability.
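A minimal sketch of atomicity, using Python's built-in sqlite3 module with an in-memory database. The `accounts` table, the `transfer` helper, and the sample balances are assumptions made up for illustration; the point is only that a failed transfer leaves no partial update behind.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst`; either both updates apply or neither does."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds and is committed
failed = transfer(conn, "alice", "bob", 500)  # overdraws, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After both calls, only the first transfer is visible in `balances`; the second one leaves the table exactly as it was.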
Batch Processing
Batch processing is a method of processing data and tasks in bulk, where a group of similar operations or computations are executed together as a batch.
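As a rough illustration, the sketch below groups a sequence of items into fixed-size batches and processes each batch in one call. The helper names `batches` and `process_batch`, and the batch size of 4, are assumptions chosen for the example.

```python
from itertools import islice


def batches(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def process_batch(batch):
    """Stand-in for real work: here, just sum the values in the batch."""
    return sum(batch)


# Process the numbers 1..10 in batches of 4.
totals = [process_batch(b) for b in batches(range(1, 11), 4)]
```

The last batch may be smaller than the others, which the helper handles without padding.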
Data Analytics
Data analytics involves examining, transforming, and interpreting data sets to uncover meaningful insights, patterns, and trends.
Data Applications
Data applications are software programs or systems designed to gather, process, analyze, and present data in a user-friendly and meaningful way.
Data Engineering
Data engineering refers to designing, building, and maintaining the architecture required to collect, transform, and store data.
Data Governance
Data governance is the set of practices that ensure the appropriate use of an organization's data assets.
Data Lineage
Data lineage refers to the documented and visual representation of the path and transformations that data undergoes.
Data Modeling
Data modeling creates a structured and visual representation of how data is organized, related, and stored within a database or information system.
Data Sharing
Data sharing is the deliberate and controlled practice of exchanging or providing access to data between systems.
Data Transformation
Data transformation is converting, altering, or reformatting raw data from one state or structure into another.
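A small sketch of one such conversion: raw string fields are parsed into typed values. The field names and sample rows are hypothetical, and storing money as integer cents is just one possible design choice.

```python
from datetime import date
from decimal import Decimal

# Hypothetical raw rows, e.g. as read from a CSV file (everything is a string).
raw_rows = [
    {"id": "1", "amount": "19.90", "ordered": "2024-01-05"},
    {"id": "2", "amount": "5.00", "ordered": "2024-02-11"},
]


def transform(row):
    """Convert one raw row into typed values."""
    return {
        "id": int(row["id"]),
        # Decimal avoids binary floating-point rounding when converting to cents.
        "amount_cents": int(Decimal(row["amount"]) * 100),
        "ordered": date.fromisoformat(row["ordered"]),
    }


clean = [transform(r) for r in raw_rows]
```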
Data Warehouse
A data warehouse is a centralized repository that stores historical and current data from various sources within an organization.
dbt
dbt (data build tool) is an open-source command-line tool for analytics engineering workflows.
Feature Engineering
Feature engineering is the process of selecting, transforming, and creating features within a dataset to enhance the performance and predictive capabilities of machine learning models.
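As a rough illustration, the sketch below derives a few model inputs from a raw customer record. The record fields and the three derived features are assumptions invented for the example, not a recommended feature set.

```python
from datetime import date


def engineer_features(record):
    """Derive numeric features from a raw (hypothetical) customer record."""
    signup = date.fromisoformat(record["signup"])
    last_order = date.fromisoformat(record["last_order"])
    orders = record["orders"]
    return {
        # Days between signing up and the most recent order.
        "tenure_days": (last_order - signup).days,
        # Average spend per order, guarding against division by zero.
        "avg_order_value": record["total_spent"] / orders if orders else 0.0,
        # 1 if the signup date fell on a Saturday or Sunday, else 0.
        "signup_on_weekend": int(signup.weekday() >= 5),
    }


features = engineer_features(
    {"signup": "2024-01-06", "last_order": "2024-03-06", "total_spent": 240.0, "orders": 4}
)
```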
Generative AI Apps
Generative AI apps are applications powered by artificial intelligence algorithms that can autonomously create content.
Google BigQuery
Google BigQuery is a fully managed, serverless data warehousing and analytics platform offered by Google Cloud.
HIPAA - Health Insurance Portability and Accountability Act
The Health Insurance Portability and Accountability Act (HIPAA) is a regulatory framework for the healthcare sector in the United States.
LLMs - Large Language Models
Large Language Models (LLMs) are advanced artificial intelligence systems trained on large volumes of text to understand and generate human language.
Machine Learning
Machine learning is a branch of artificial intelligence that involves the development of algorithms that learn patterns from data rather than following explicitly programmed rules.
MLOps - Machine Learning Operations
MLOps aims to ensure efficient collaboration, enabling the seamless integration of machine learning into production systems.
Microsoft Azure
Microsoft Azure is a comprehensive cloud computing platform and set of services provided by Microsoft.
Parquet
Parquet is an open-source columnar file format optimized for efficiently storing and processing large datasets.
PHI - Protected Health Information
Protected Health Information (PHI) refers to any identifiable health-related data.
PII - Personally Identifiable Information
Personally Identifiable Information (PII) refers to any data that can be used to uniquely identify, locate, or contact an individual.
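One common way to handle such fields is to replace them with salted hashes before the data is shared. The sketch below assumes a hypothetical set of PII field names and a caller-supplied salt; note that this is pseudonymization rather than full anonymization, and it is not a compliance recipe.

```python
import hashlib

# Hypothetical list of fields treated as PII in this example.
PII_FIELDS = {"email", "phone"}


def pseudonymize(record, salt):
    """Return a copy of `record` with PII fields replaced by truncated salted hashes."""
    masked = {}
    for key, value in record.items():
        if key in PII_FIELDS:
            digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
            masked[key] = digest[:16]
        else:
            masked[key] = value
    return masked


original = {"email": "ada@example.com", "phone": "555-0100", "plan": "pro"}
masked = pseudonymize(original, salt="example-salt")
```

The same input and salt always produce the same token, so masked records can still be joined on the hashed field.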
Power BI
Power BI is a business analytics and data visualization platform developed by Microsoft.
Sales Analytics
Sales analytics uses data analysis and statistical techniques to examine sales-related information, trends, and patterns.
Semi-Structured
Semi-structured data refers to information that does not fit neatly into a rigid database schema but contains some level of organization or hierarchy.
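JSON is a typical example: records share a general shape, but fields can be nested or missing. The sketch below flattens nested JSON objects into dotted column names; the sample events and the `flatten` helper are assumptions made up for illustration.

```python
import json

# Hypothetical events: both have a "user", but the other fields differ.
events = [
    '{"user": {"id": 1, "name": "Ada"}, "tags": ["new"]}',
    '{"user": {"id": 2}, "device": "mobile"}',
]


def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value  # lists and scalars are kept as-is
    return flat


rows = [flatten(json.loads(e)) for e in events]
```

Each row ends up with a different set of keys, which is exactly what makes the data awkward to load into a fixed table layout.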
Sigma Computing
Sigma Computing is a cloud-based analytics and business intelligence platform that empowers non-technical users.
Snowflake
Snowflake is a cloud-based data warehousing platform that provides scalable and flexible solutions for large volumes of data.
Snowpipe
Snowpipe is an automated data ingestion feature within the Snowflake Data Cloud platform.
Snowpipe Streaming
Snowpipe Streaming is a feature within the Snowflake Data Cloud platform that enables continuous and real-time data ingestion from streaming sources.
SQL to Snowflake Migration
SQL to Snowflake migration refers to transferring and adapting existing database systems, applications, or data workloads that use SQL to Snowflake.
Streaming Data
Streaming data refers to a continuous flow of real-time, time-sensitive data from various sources such as sensors, devices, social media, or applications.
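A minimal sketch of processing such a flow one item at a time, rather than waiting for a complete dataset. The `sensor` generator stands in for a real source and its readings are invented; `rolling_mean` emits an updated average as each reading arrives.

```python
from collections import deque


def rolling_mean(stream, window):
    """Yield the mean of the last `window` readings after each new reading."""
    recent = deque(maxlen=window)  # old readings drop off automatically
    for reading in stream:
        recent.append(reading)
        yield sum(recent) / len(recent)


def sensor():
    """Hypothetical source; a real stream would be unbounded."""
    yield from [10.0, 12.0, 14.0, 16.0]


means = list(rolling_mean(sensor(), window=2))
```

Because both functions are generators, only the current window is held in memory at any time.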