What is a Large Language Model (LLM)?
A Large Language Model (LLM) is a neural network, typically a transformer with billions of parameters, trained on massive text corpora to predict the next token and thereby generate language. LLMs power modern AI assistants (ChatGPT, Claude, Gemini), code generation, summarisation, and the broader “generative AI” wave that has grown since 2022. The “large” threshold is informal, but it generally means 1B+ parameters.
Key LLM characteristics
- Foundation model approach: pre-trained once on broad data, then fine-tuned or prompted for specific tasks.
- Emergent capabilities: at sufficient scale, capabilities (reasoning, multi-step tasks) emerge that smaller models lack.
- In-context learning: LLMs can perform new tasks from examples in the prompt without retraining.
- Probabilistic output: LLMs generate text by sampling one token at a time from a probability distribution, which makes output varied and creative but sometimes unfaithful to the facts.
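The sampling behaviour described in the last point can be illustrated with a small sketch. It assumes a toy three-word vocabulary and made-up scores (`vocab` and `logits` are hypothetical, not taken from any real model), and it uses temperature scaling, one common way to control how random the sampling is.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0, rng=random):
    """Draw one token at random, weighted by its probability."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical scores for the token following "The capital of France is"
vocab = ["Paris", "Lyon", "London"]
logits = [4.0, 1.5, 0.5]

rng = random.Random(0)  # fixed seed so repeated runs behave the same
for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    draws = [sample_next_token(vocab, logits, t, rng) for _ in range(5)]
    print(t, [round(p, 3) for p in probs], draws)
```

At a low temperature almost all of the probability sits on the top-scoring token. At a higher temperature the less likely (and here incorrect) tokens are drawn more often, which is one simple way to see how sampling can produce unfaithful output.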
LLM categories
- Proprietary closed: OpenAI GPT-4/5, Anthropic Claude, Google Gemini — API access, no weights.
- Open weight: Meta Llama, Mistral, Qwen — model weights released; can be self-hosted.
- Specialised: models fine-tuned for a specific vertical, such as code (Codex, Copilot), medicine, or law.
Legal and operational implications for startups
- EU AI Act compliance: obligations for general-purpose AI (GPAI) models apply from August 2025; providers must meet transparency requirements, publish a training-data summary, and maintain a copyright policy.
- KVKK / GDPR: personal data in prompts triggers controller/processor analysis; aggregated logs may also constitute personal data.
- Output liability: LLM hallucinations have caused real harm; commercial deployments need disclaimers and human review for high-stakes use.
- IP and training data: ongoing litigation (NYT v. OpenAI, Getty v. Stability) will shape whether training on copyrighted material counts as fair use or falls under a comparable exception.
LLM use in the Turkish market
LLM use among Turkish startups spread rapidly in 2023-2025. OpenAI and Anthropic APIs are the most widely used, and Turkey-based open-source models optimised for Turkish (see the model cards from companies such as Trendyol and Hesap.ai) are emerging for localisation. Under KVKK, prompts that contain personal data require special safeguards with respect to cross-border transfer.
Do: classify the data you send to LLM APIs by sensitivity; log prompts and outputs for incident response; have a human in the loop review high-stakes outputs.
Don’t: assume the LLM is “private” because you have an API key — privacy depends on contractual terms and data handling commitments, not API access.
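As a rough illustration of the "Do" items, the sketch below screens a prompt for sensitive data, redacts it, and logs both the prompt and the output before and after the model call. Everything in it is an assumption made for illustration: the two regex patterns (an email shape and an 11-digit Turkish national ID shape, with no checksum validation), the two-level `general`/`restricted` scheme, and the `call_model` callback that stands in for whichever provider SDK is actually used. Real classification needs broader, locale-aware detection and legal review.

```python
import logging
import re
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("llm_audit")

# Illustrative patterns only; they will miss many kinds of personal data.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    # 11 digits, first digit non-zero: the shape of a T.C. Kimlik No.
    "national_id": re.compile(r"\b[1-9]\d{10}\b"),
}

@dataclass
class Screened:
    level: str      # "restricted" if any pattern matched, else "general"
    findings: list  # names of the patterns that matched
    safe_text: str  # prompt with matches replaced by placeholders

def screen_prompt(prompt: str) -> Screened:
    """Classify a prompt by sensitivity and redact what was found."""
    findings = []
    safe = prompt
    for name, pattern in PATTERNS.items():
        safe, count = pattern.subn(f"[{name.upper()}]", safe)
        if count:
            findings.append(name)
    level = "restricted" if findings else "general"
    return Screened(level, findings, safe)

def send_to_llm(prompt: str, call_model) -> str:
    """Screen, log, and forward a prompt; call_model is a hypothetical callback."""
    screened = screen_prompt(prompt)
    log.info("prompt level=%s findings=%s text=%r",
             screened.level, screened.findings, screened.safe_text)
    output = call_model(screened.safe_text)
    log.info("output text=%r", output)
    return output

if __name__ == "__main__":
    # Stand-in for a real API call: simply echoes the redacted prompt.
    print(send_to_llm("Email ayse@example.com about ID 12345678901 today",
                      lambda p: p))
```

Logging the redacted text rather than the raw prompt keeps the audit trail useful for incident response without turning the log itself into a second store of personal data. Whether redaction alone is sufficient for a given KVKK or GDPR use case is a legal question this sketch does not answer.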