How retrieval-augmented generation over your own data works and what drives its cost.
dgm is an independent osFoundry integration partner — not affiliated with osFoundry’s maker (OS LLC), and dgm has no completed client integrations yet.
Retrieval-augmented generation (RAG) is how you make AI answer accurately from your own documents. This guide explains the approach and what drives its cost.
How RAG works
Your documents are split into passages and indexed. When someone asks a question, the most relevant passages are retrieved and passed to the model along with the question, so it answers from your data and can cite its sources, which reduces hallucination.
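As a rough illustration of that index, retrieve, then answer flow, here is a minimal sketch using only the Python standard library. It is not osFoundry's API or any particular product's method. The sample documents, the function names and the simple word-overlap (cosine) scoring are all assumptions made for the example; a production system would typically use an embedding model and a vector index instead.

```python
import math
import re
from collections import Counter

# Hypothetical sample passages; in practice these come from your own files.
DOCS = {
    "leave-policy.txt": "Employees receive 14 days of annual leave per year.",
    "expenses.txt": "Expense claims must be submitted within 30 days with receipts.",
    "it-security.txt": "Passwords must be rotated every 90 days and never shared.",
}

def vectorize(text):
    """Turn text into a bag-of-words vector (word -> count)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

# Indexing step: done once, and again whenever the documents change.
INDEX = {name: vectorize(text) for name, text in DOCS.items()}

def retrieve(question, k=2):
    """Return the names of the k passages most similar to the question."""
    q = vectorize(question)
    ranked = sorted(INDEX, key=lambda name: cosine(q, INDEX[name]), reverse=True)
    return ranked[:k]

def build_prompt(question, k=2):
    """Assemble the text sent to the model: retrieved passages plus the question."""
    sources = retrieve(question, k)
    context = "\n".join(f"[{name}] {DOCS[name]}" for name in sources)
    return (
        "Answer using only the sources below and cite them by name.\n"
        f"{context}\n"
        f"Question: {question}"
    )

if __name__ == "__main__":
    print(build_prompt("How many days of annual leave do employees receive?", k=1))
```

The prompt carries each passage's file name, which is what lets the model cite where an answer came from.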
What drives the cost
Four things drive the cost: the volume of documents, how often you re-index them, query volume (model usage), and where the system runs. A modest internal knowledge base is inexpensive; a large, high-traffic system costs more.
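To see how these factors combine, a back-of-the-envelope estimate can be sketched as below. The formula is a simplification, and every name and number in it is an assumption for illustration: the per-million-token prices are placeholders, not quotes from any provider, and hosting is reduced to a single flat monthly figure.

```python
from dataclasses import dataclass

@dataclass
class RagWorkload:
    document_tokens: int        # total tokens across all indexed documents
    reindexes_per_month: float  # how many times the full corpus is re-indexed
    queries_per_month: int      # questions asked per month
    tokens_per_query: int       # question + retrieved passages + answer

def monthly_cost(workload, embed_price_per_m, model_price_per_m, hosting_per_month=0.0):
    """Estimate monthly cost. Prices are per million tokens (hypothetical inputs)."""
    indexing = (
        workload.document_tokens * workload.reindexes_per_month / 1_000_000
    ) * embed_price_per_m
    querying = (
        workload.queries_per_month * workload.tokens_per_query / 1_000_000
    ) * model_price_per_m
    return indexing + querying + hosting_per_month

if __name__ == "__main__":
    # Placeholder workloads and prices, chosen only to show the shape of the calculation.
    small = RagWorkload(2_000_000, 1, 1_000, 2_000)
    large = RagWorkload(500_000_000, 4, 200_000, 4_000)
    for label, w in (("small", small), ("large", large)):
        print(label, round(monthly_cost(w, embed_price_per_m=0.10, model_price_per_m=1.00), 2))
```

In a model like this, indexing cost scales with document volume times re-index frequency, while query cost scales with traffic and with how much retrieved text is sent per question.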
Keeping it controlled
Run it on a model-agnostic platform so you can choose cost-effective models and keep data where it should be. osFoundry’s managed cloud pins data to the US, EU or Japan; it does not currently offer a Hong Kong managed region (its nearest managed region is Japan). To keep data in Hong Kong, there are two options: self-host osFoundry (BYO Cloud) inside a Hong Kong cloud region such as AWS Asia Pacific (Hong Kong) ap-east-1, Microsoft Azure East Asia (Hong Kong SAR) or Google Cloud asia-east2 (Hong Kong), or run models locally on-device.
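The residency options described above can be written down as a small lookup, which is sometimes handy in a scoping checklist. This is only a sketch: the function and constant names are hypothetical, the region lists simply restate this guide, and nothing here calls osFoundry or any cloud provider.

```python
# Managed regions as stated in this guide (no Hong Kong managed region).
MANAGED_REGIONS = {"US", "EU", "Japan"}

# Hong Kong cloud regions named in this guide, keyed by provider.
HONG_KONG_SELF_HOST = {
    "AWS": "ap-east-1",
    "Microsoft Azure": "East Asia",
    "Google Cloud": "asia-east2",
}

def deployment_options(required_location):
    """List the deployment paths that keep data in the required location."""
    if required_location in MANAGED_REGIONS:
        return [f"managed cloud ({required_location})"]
    if required_location == "Hong Kong":
        self_hosted = [
            f"self-host (BYO Cloud) on {provider} {region}"
            for provider, region in HONG_KONG_SELF_HOST.items()
        ]
        return self_hosted + ["run models locally on-device"]
    # Locations this guide does not cover: no recommendation is made.
    return []

if __name__ == "__main__":
    for option in deployment_options("Hong Kong"):
        print(option)
```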
Where dgm fits
dgm is an independent integration partner that helps Hong Kong businesses adopt osFoundry — scoping a first use case, handling the build, and connecting AI to the systems you already run. dgm is independent of osFoundry’s maker (OS LLC) and has no completed client integrations yet, so everything described here is a service offered, not a past result. If you want to scope a practical first project, dgm can help you map it out.