Access Control in RAG: Who Gets to See What

The Problem Nobody Wants to Think About

You build a beautiful RAG system. It indexes your entire corporate knowledge base. Policies, financial reports, HR documents, engineering specs, board meeting notes, salary bands, M&A plans, client contracts, disciplinary records.

Then an intern asks it a question and gets back the CEO's compensation package.

This isn't hypothetical. I've seen it happen. More than once.

The default state of a RAG system is "everyone can see everything." And in an enterprise, that's not just inconvenient. It's a compliance violation, a legal liability, and a trust-destroying event.

Access control in RAG isn't a nice-to-have. It's table stakes. Build it in from day one or don't build the system at all.

Why It's Harder Than You Think

Traditional access control works at the document level. This file is readable by these groups. Check permissions, grant or deny. Done.

RAG breaks this model in three ways.

Chunking fragments permissions. A document might have public sections and confidential sections. When you chunk it, those sections become independent vectors. The chunk doesn't inherently know it came from a restricted part of a document. It is worth reading about least-privilege principles for agents alongside this.

Retrieval happens before display. In traditional systems, the user clicks on a file, permissions are checked, access is granted or denied. In RAG, retrieval happens server-side, and the retrieved chunks influence the LLM's response. Even if you never show the chunk itself, its content can leak through the generated answer.

Aggregation creates new sensitivities. Individual chunks might be fine independently. But combining salary data from HR, project assignments from engineering, and performance reviews creates a composite picture that should be restricted. The whole is more sensitive than the parts.
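One rough way to guard against aggregation is to count how many distinct sensitive categories a result set spans and hold back the response when the combination crosses a threshold. The sketch below assumes each chunk carries a hypothetical `category` metadata field, and that `SENSITIVE_CATEGORIES` and the threshold come from your own data classification policy. None of these names come from a particular library.

```python
from dataclasses import dataclass, field

# Hypothetical policy values; tune these to your own data classification.
SENSITIVE_CATEGORIES = {"hr:salary", "hr:performance", "eng:assignments"}
MAX_SENSITIVE_CATEGORIES = 2


@dataclass
class Chunk:
    id: str
    text: str
    metadata: dict = field(default_factory=dict)


def sensitive_categories(chunks: list[Chunk]) -> set[str]:
    """Return the distinct sensitive categories present in a result set."""
    return {
        c.metadata["category"]
        for c in chunks
        if c.metadata.get("category") in SENSITIVE_CATEGORIES
    }


def violates_aggregation_policy(chunks: list[Chunk]) -> bool:
    """True when the result set combines more sensitive categories than allowed."""
    return len(sensitive_categories(chunks)) > MAX_SENSITIVE_CATEGORIES
```

A check like this is a blunt instrument. It won't understand what the combined content means, but it gives you a place to stop and escalate before the LLM stitches the pieces together.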

Architecture: Pre-Retrieval Filtering

The most reliable approach. Filter chunks BEFORE retrieval based on the user's permissions.

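from dataclasses import dataclass, field


# Minimal user shape assumed by the examples in this post.
@dataclass
class User:
    id: str
    groups: list[str] = field(default_factory=list)
    roles: list[str] = field(default_factory=list)


class SecureRetriever:
    def __init__(self, vector_store, permission_service):
        self.vectors = vector_store
        self.permissions = permission_service

    async def search(self, query: str, user: User, top_k: int = 10):
        # Step 1: Get user's accessible document IDs
        accessible_docs = await self.permissions.get_accessible_documents(
            user_id=user.id,
            groups=user.groups,
            roles=user.roles,
        )

        # Fail closed: with nothing accessible, skip the search entirely
        # rather than rely on how the store treats an empty "$in" list.
        if not accessible_docs:
            return []

        # Step 2: Search with metadata filter
        results = await self.vectors.search(
            query=query,
            top_k=top_k,
            filter={
                "document_id": {"$in": accessible_docs}
            },
        )

        return results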

This is clean and secure. The vector store never returns chunks the user can't see, provided the filter and the underlying permission data are correct.

The downside: maintaining the accessible document list per user can be expensive. If you have 100,000 documents and complex group-based permissions, that filter can get large.

Optimization: Permission Tags

Instead of filtering by individual document IDs, tag chunks with permission groups at ingestion time.

async def ingest_document(document, chunker, embedder, vector_store):
    # Get document permissions from source system
    permissions = await get_document_permissions(document.id)
    permission_tags = permissions.to_tags()
    # e.g., ["group:engineering", "group:senior-engineering", "role:admin"]

    chunks = chunker.chunk(document)
    for chunk in chunks:
        embedding = await embedder.embed(chunk.text)
        await vector_store.upsert(
            id=chunk.id,
            embedding=embedding,
            metadata={
                "document_id": document.id,
                "permission_tags": permission_tags,
                "source": document.source,
                "updated_at": document.updated_at,
            }
        )

At query time, filter on permission tags instead of document IDs:

async def search(self, query: str, user: User, top_k: int = 10):
    user_tags = await self.permissions.get_user_tags(user)
    # e.g., ["group:engineering", "role:team-lead", "department:platform"]

    results = await self.vectors.search(
        query=query,
        top_k=top_k,
        filter={
            "permission_tags": {"$containsAny": user_tags}
        },
    )
    return results

This is faster because permission tags are small, indexed metadata fields rather than lists of 100K document IDs.
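Filter operator names differ between vector stores, so read `$containsAny` as "the chunk's tags and the user's tags share at least one element." A minimal in-memory sketch of that rule, using hypothetical `can_see` and `filter_visible` helpers, might look like this:

```python
def can_see(chunk_tags: list[str], user_tags: list[str]) -> bool:
    """A chunk is visible when it shares at least one tag with the user.

    An untagged chunk matches nobody, so missing permissions fail closed.
    """
    return not set(chunk_tags).isdisjoint(user_tags)


def filter_visible(chunks: list[dict], user_tags: list[str]) -> list[dict]:
    """Keep only chunks whose permission_tags overlap the user's tags."""
    return [
        c for c in chunks
        if can_see(c.get("permission_tags", []), user_tags)
    ]
```

Worth checking against your own store: a chunk with no tags at all should be invisible to everyone, not visible to everyone.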

The Ingestion Challenge: Keeping Permissions in Sync

Permissions change. People join teams. People leave. Documents get reclassified. Projects go from confidential to public.

Your RAG index needs to reflect these changes, and it needs to reflect them promptly. A document that was restricted yesterday and unrestricted today should be retrievable. A document that was public yesterday and restricted today must NOT be retrievable by unauthorized users.

class PermissionSyncPipeline:
    async def sync(self):
        """Run periodically to keep permissions in sync."""
        # Capture the cutoff before fetching, so changes that land while
        # this run is in progress are picked up by the next one.
        sync_started_at = now()

        # Get all permission changes since last sync
        changes = await self.permission_source.get_changes(
            since=self.last_sync_timestamp
        )

        for change in changes:
            if change.type == "document_permission_changed":
                # Update all chunks for this document
                new_tags = change.new_permissions.to_tags()
                await self.vector_store.update_metadata(
                    filter={"document_id": change.document_id},
                    metadata={"permission_tags": new_tags},
                )

            elif change.type == "document_deleted":
                await self.vector_store.delete(
                    filter={"document_id": change.document_id}
                )

        # Advance to the captured cutoff, not the finish time. A change may
        # be applied twice, which is harmless because updates are idempotent.
        self.last_sync_timestamp = sync_started_at

How often to sync depends on your risk tolerance. For most enterprises, hourly is fine. For sensitive environments (healthcare, finance, legal), you might need real-time sync via webhooks. For a deeper look, see data leakage risks.
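For the real-time case, the same two change types can be applied one event at a time as they arrive. Here's a rough sketch of the handler side only. The event payload shape (`type`, `document_id`, `new_tags`) and the `InMemoryIndex` stand-in are assumptions for illustration; your permission source and vector store will have their own formats.

```python
import asyncio


class InMemoryIndex:
    """Stand-in for a vector store; maps chunk ID to its metadata."""

    def __init__(self):
        self.chunks: dict[str, dict] = {}

    async def update_tags(self, document_id: str, tags: list[str]) -> int:
        updated = 0
        for meta in self.chunks.values():
            if meta["document_id"] == document_id:
                meta["permission_tags"] = list(tags)
                updated += 1
        return updated

    async def delete_document(self, document_id: str) -> int:
        doomed = [
            cid for cid, meta in self.chunks.items()
            if meta["document_id"] == document_id
        ]
        for cid in doomed:
            del self.chunks[cid]
        return len(doomed)


async def handle_permission_event(index: InMemoryIndex, event: dict) -> int:
    """Apply one change event as it arrives; returns chunks affected."""
    if event["type"] == "document_permission_changed":
        return await index.update_tags(event["document_id"], event["new_tags"])
    if event["type"] == "document_deleted":
        return await index.delete_document(event["document_id"])
    # Unknown event types are ignored rather than guessed at.
    return 0
```

Even with webhooks, keep the periodic sync as a backstop. Webhook deliveries get dropped.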

Sub-Document Permissions

Here's where it gets properly complicated. Some documents have mixed sensitivity.

A board meeting transcript might have public announcements, confidential financials, and restricted personnel discussions. One document, three permission levels.

class SubDocumentPermissionChunker:
    def __init__(self, chunker):
        self.chunker = chunker

    def chunk_with_permissions(self, document, sections):
        """
        Chunk document respecting section-level permissions.
        Never merge chunks across permission boundaries.
        """
        chunks = []
        for section in sections:
            section_chunks = self.chunker.chunk(section.text)
            for chunk in section_chunks:
                # Keep the document ID so permission sync can find these chunks
                chunk.metadata["document_id"] = document.id
                chunk.metadata["permission_tags"] = section.permission_tags
                chunk.metadata["sensitivity_level"] = section.sensitivity
                chunks.append(chunk)

        return chunks

The key rule: never merge content across permission boundaries during chunking. If section A is public and section B is confidential, they must be separate chunks even if they're small enough to combine.
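If your pipeline merges small chunks after splitting, that merge step is where the rule is easiest to break. A minimal sketch of a boundary-aware merge, assuming chunks are plain dicts with `text` and `permission_tags` keys and a hypothetical `min_chars` threshold:

```python
def merge_small_chunks(chunks: list[dict], min_chars: int = 200) -> list[dict]:
    """Merge adjacent small chunks, but only when their permission tags match.

    Each chunk is a dict with "text" and "permission_tags" keys.
    """
    merged: list[dict] = []
    for chunk in chunks:
        prev = merged[-1] if merged else None
        same_boundary = (
            prev is not None
            and set(prev["permission_tags"]) == set(chunk["permission_tags"])
        )
        if same_boundary and len(prev["text"]) < min_chars:
            # Safe to combine: both pieces carry identical permissions.
            prev["text"] = prev["text"] + "\n" + chunk["text"]
        else:
            # Copy so the caller's input is not mutated by later merges.
            merged.append({
                "text": chunk["text"],
                "permission_tags": list(chunk["permission_tags"]),
            })
    return merged
```

Two public fragments get combined. A public fragment next to a confidential one stays separate, no matter how short either is.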

Post-Retrieval Verification

Belt and suspenders. Even with pre-retrieval filtering, add a post-retrieval permission check.

async def search_with_verification(self, query, user, top_k=10):
    # Pre-filtered retrieval
    candidates = await self.secure_search(query, user, top_k=top_k * 2)

    # Post-retrieval verification
    verified = []
    for chunk in candidates:
        if await self.permissions.verify_access(
            user_id=user.id,
            document_id=chunk.metadata["document_id"],
            chunk_id=chunk.id,
        ):
            verified.append(chunk)
            if len(verified) >= top_k:
                break

    return verified

Why double-check? Because metadata filters can have bugs. Permission tags can be stale. The verification step catches any leakage that slipped through pre-filtering.

Audit Trail: Non-Negotiable

Every retrieval should be logged with the user identity, what was retrieved, and what permissions were checked.

class AuditedRetriever:
    async def search(self, query, user, top_k=10):
        results = await self.secure_retriever.search(query, user, top_k)

        await self.audit_log.record({
            "timestamp": now(),
            "user_id": user.id,
            "user_groups": user.groups,
            "query": query,
            "retrieved_chunk_ids": [r.id for r in results],
            "retrieved_document_ids": list(set(
                r.metadata["document_id"] for r in results
            )),
            "permission_check": "pre_filter + post_verify",
        })

        return results

This isn't optional for regulated industries. And even for unregulated ones, when someone asks "did anyone access X?", you need to be able to answer.
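Answering that question is a simple scan if the log is structured. The sketch below assumes audit records are dicts with the same fields `AuditedRetriever` writes, and that timestamps are stored as ISO 8601 strings so they sort chronologically. `who_accessed` is a hypothetical helper, not part of any logging library.

```python
def who_accessed(audit_records: list[dict], document_id: str) -> list[tuple[str, str]]:
    """Return (timestamp, user_id) pairs for retrievals that touched a document."""
    hits = [
        (rec["timestamp"], rec["user_id"])
        for rec in audit_records
        if document_id in rec["retrieved_document_ids"]
    ]
    # ISO 8601 strings in a single timezone sort in chronological order.
    return sorted(hits)
```

In production you'd run this as a query against your log store rather than in Python, but the shape of the question is the same.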

Integration with Identity Providers

Don't build your own permission system. Integrate with what the organization already uses.

class EnterprisePermissionService:
    def __init__(self, idp_client):
        self.idp = idp_client  # Okta, Azure AD, Google Workspace, etc.

    async def get_user_tags(self, user: User) -> list[str]:
        """Resolve user's effective permissions from IdP."""
        # Direct groups
        groups = await self.idp.get_user_groups(user.id)
        # Transitive groups (nested group membership)
        all_groups = await self.idp.resolve_transitive_groups(groups)
        # Roles
        roles = await self.idp.get_user_roles(user.id)
        # Department
        dept = await self.idp.get_user_department(user.id)

        tags = []
        tags.extend(f"group:{g}" for g in all_groups)
        tags.extend(f"role:{r}" for r in roles)
        tags.append(f"department:{dept}")

        return tags

This ensures that when someone is removed from a group in your identity provider, they lose access in your RAG system at the next sync. No separate permission management to maintain. The related post on [keeping the index current](/blog/rag-document-sync-incremental) goes further on this point.
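The transitive step is the one people skip. Most identity providers expose their own way to resolve nested membership, so prefer that where it exists. If you only have the raw nesting data, a breadth-first walk is enough. This sketch assumes a hypothetical `parents` mapping from each group to the groups it is a member of:

```python
from collections import deque


def resolve_transitive_groups(
    direct_groups: list[str],
    parents: dict[str, list[str]],
) -> set[str]:
    """Expand direct memberships to every group reachable via nesting.

    `parents` maps a group to the groups it is a member of. The visited
    set also guards against cycles in a misconfigured directory.
    """
    seen = set(direct_groups)
    queue = deque(direct_groups)
    while queue:
        group = queue.popleft()
        for parent in parents.get(group, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen
```

Without this, a user who belongs to "platform" never matches chunks tagged for "engineering", even though the directory says platform is part of engineering.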

The Minimum Viable Approach

If this all feels overwhelming, here's the simplest approach that's still secure:

Tag every chunk with its source document's permission group at ingestion
Filter on permission group at query time
Sync permissions daily
Log all retrievals

That's four things. Not twenty. Start there. Add sub-document permissions and real-time sync when you need them.

The worst approach is no access control at all, followed closely by "we'll add it later." Later never comes. Or it comes after the incident. Build it in from the start.