GitHub Enterprise API in 2026: PAT vs GitHub Apps, the 15K Rate Limit, Audit Log, EMU, and GHES Version Skew

GitHub Enterprise Cloud and Server share a brand but ship two distinct API surfaces. The auth choice dictates whether you hit 5K or 15K requests per hour, the audit log lives only on the $21-per-seat tier, EMU accounts cannot do half the things the docs imply, and GHES version skew breaks code that worked yesterday. Here is the practical mapping.

May 27, 2026 | Read time: 13 min

A platform engineer at a 300-engineer company sat next to me at a conference last month asking why their freshly built GitHub Apps integration was getting throttled on a Monday morning every week. They had migrated from a fleet of personal access tokens (rationale: "Apps have higher rate limits, the docs say 15,000 per hour") and were now seeing HTTP 429s during the once-a-week dependency scan that polled every repository for vulnerability data. The number they were burning through was about 9,000 calls per hour. The cap they thought they had was 15,000. They were not wrong about the documentation; they had just read the wrong sentence.

GitHub's Enterprise API surface is one of those products where everything technically exists in the docs and nothing is highlighted in the order an integration engineer actually needs it. The headline numbers (5,000 PAT, 15,000 App) are real, but the qualifiers (scales with users and repositories, secondary limits trigger separately, GHEC and GHES behave differently) are the part that breaks production. This piece walks through the GitHub Enterprise API at the layer the docs technically cover but bury: which auth pattern actually scales, where the rate limit cliffs hit, what the audit log gives you that justifies the $21 per seat, what EMU silently breaks, and how to write code that survives GHES version skew.

The five auth methods, ranked by how they hold up at enterprise scale

GitHub has accumulated five distinct authentication paths over the years, and not all of them are equivalent at enterprise volume:

Method | Rate limit | Scope | Lifecycle | Use when
--- | --- | --- | --- | ---
Unauthenticated | 60/hour per IP | Public read only | None | Never in production
Personal access token (PAT) | 5,000/hour | User-level scopes | Dies with user offboarding | One-off scripts
OAuth App | 5,000/hour per app | User scopes via OAuth flow | App-level | User-facing third-party tools
GitHub App (installation token) | 5,000-15,000/hour | Installation-scoped, fine-grained | App-level, survives users | Enterprise integrations
GITHUB_TOKEN in Actions | 1,000-15,000/hour per repo | Workflow-scoped | Per-run | CI/CD inside Actions

The non-obvious entry in this table is the GitHub App row. The 5,000-to-15,000 range is not a single number; it depends on where the app is installed and how large the installation is. GHES installations start at a 5,000-per-hour baseline, then add 50 requests per hour for each repository beyond the first 20 and 50 per hour for each user beyond the first 20, capped at 12,500. GHEC installations start at 15,000 per hour, which is also their cap, so the scaling never changes the number. A GitHub App installed on a GHEC organization with 400 repos and 200 users gets 15,000. A GitHub App installed on a GHES instance with 30 repos and 25 users gets 5,000 + (10 × 50) + (5 × 50) = 5,750.
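The scaling rule is easier to sanity-check as code than as prose. Here is a minimal sketch, assuming the baseline, the per-repository and per-user increments, and the caps are exactly as described above. The function name `installation_rate_limit` is illustrative and not part of any GitHub API.

```python
def installation_rate_limit(repos: int, users: int, ghec: bool = False) -> int:
    """Estimate the hourly primary limit for a GitHub App installation.

    Assumes the figures quoted in this article: GHEC installations sit at
    15,000; everything else starts at 5,000 and scales up to 12,500.
    """
    if ghec:
        return 15_000

    baseline = 5_000
    included = 20      # repos and users covered by the baseline
    increment = 50     # extra requests/hour per repo or user beyond that
    ceiling = 12_500

    extra_repos = max(0, repos - included)
    extra_users = max(0, users - included)
    return min(ceiling, baseline + increment * (extra_repos + extra_users))


print(installation_rate_limit(repos=30, users=25))               # 5750
print(installation_rate_limit(repos=400, users=200, ghec=True))  # 15000
```

Running this against your own repository and user counts before a migration gives you the number to plan around, rather than the headline figure.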

The platform engineer at the conference had 300 engineers and several hundred repositories but was on GHES, so their effective cap was the GHES 12,500 ceiling minus whatever the secondary CPU-time limits had eaten that morning. The 15,000 number they remembered was the GHEC ceiling, which they did not have.

Rate limit math, with the numbers the docs do not put on one page

The primary rate limits are documented on the official rate limits page, but the secondary limits are scattered. The full set worth knowing on day one:

Limit | Trigger | What you see
--- | --- | ---
Primary, per identity | 5K/15K calls/hour | 403 with x-ratelimit-remaining: 0
Concurrent requests | More than 100 in flight | 429 or 403
Per-endpoint points | More than 900 points/minute | 403 with Retry-After header
CPU time | More than 90 seconds CPU per 60 seconds wall | 403
Content-creating writes | 80/minute or 500/hour | 403
OAuth token requests | 2,000/hour | 403
Search API | 10/minute unauthenticated, 30/minute authenticated | 403

The two most common surprise hits are the per-endpoint-points limit and the content-creating limit. The points limit treats each REST endpoint as having its own per-minute bucket, so hammering one endpoint with thousands of read calls per minute (say, /repos/{owner}/{repo}/issues/comments) trips the points limit even when the global hourly counter has plenty of headroom. The content-creating limit hits bots that auto-file issues, post status checks, or open pull requests in bulk; 500 per hour is tight if you are syncing state from an external tracker.
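One way to stay under the content-creating limit is to pace writes on the client instead of waiting for the 403. The sketch below is a sliding-window pacer, assuming the 80/minute and 500/hour figures from the table above. `WritePacer` and its method names are hypothetical, and the clock is injectable only so the behavior can be tested without sleeping.

```python
import time
from collections import deque
from typing import Callable, Deque


class WritePacer:
    """Sliding-window pacer for content-creating requests (illustrative)."""

    def __init__(self, per_minute: int = 80, per_hour: int = 500,
                 clock: Callable[[], float] = time.monotonic) -> None:
        # (window length in seconds, max writes allowed inside that window)
        self.windows = ((60.0, per_minute), (3600.0, per_hour))
        self.clock = clock
        self.sent: Deque[float] = deque()

    def delay(self) -> float:
        """Seconds to wait before one more write fits inside every window."""
        now = self.clock()
        horizon = max(span for span, _ in self.windows)
        # Forget writes that no longer count against any window.
        while self.sent and now - self.sent[0] >= horizon:
            self.sent.popleft()

        wait = 0.0
        for span, limit in self.windows:
            recent = [t for t in self.sent if now - t < span]
            if len(recent) >= limit:
                # This write has to age out before there is room again.
                blocking = recent[len(recent) - limit]
                wait = max(wait, blocking + span - now)
        return wait

    def record(self) -> None:
        """Call once after each write request is sent."""
        self.sent.append(self.clock())
```

Typical use is `time.sleep(pacer.delay())` before each POST and `pacer.record()` after it. The server's own limit is authoritative, so this reduces 403s rather than replacing the retry path.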

The X-RateLimit-* response headers report the primary-limit side of this. The four primary ones are X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Used, and X-RateLimit-Reset (Unix timestamp). The hidden one to instrument is X-RateLimit-Resource, which tells you which bucket the call was billed against (core, search, code_search, graphql, integration_manifest, code_scanning_upload, actions_runner_registration, dependency_snapshots). When you see throttling, the first question is which bucket, and the answer comes from that header.

The audit log API is the entire $21-per-seat reason

The single most operationally valuable Enterprise-only endpoint is the audit log. On GHEC the path is /enterprises/{ENTERPRISE}/audit-log, takes a phrase query parameter for filtering, and returns a paginated list of events. The event types worth piping into a SIEM are:

  • org.sso_response_succeeded and org.sso_response_failed (SSO login attempts)
  • repo.add_member / repo.remove_member / repo.change_member_permission
  • org.update_member_repository_creation_permission
  • personal_access_token.access_granted and personal_access_token.access_revoked
  • oauth_authorization.create and oauth_authorization.destroy
  • protected_branch.update and protected_branch.destroy
  • repo.transfer and repo.destroy

Each event carries an ISO 8601 timestamp, the actor login, the actor IP address, the repository or org context, and event-specific metadata. The data is dense enough that even a brand-new GHEC tenant generates dozens of events per active engineer per week. Plan storage accordingly: a 500-seat enterprise typically produces 50,000-200,000 audit events per month, which is small for object storage but non-trivial if you intend to query against it in a relational database without partitioning.
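Pulling those events into a SIEM mostly comes down to building the query and following pagination. The sketch below assumes the endpoint accepts a `per_page` parameter alongside `phrase` and signals the next page through a `Link` header with `rel="next"`, as GitHub's REST API generally does. `fetch` is a hypothetical callable you supply: it performs the authenticated GET and returns the decoded events plus the raw `Link` header value.

```python
import re
from typing import Callable, Iterator, List, Optional, Tuple
from urllib.parse import quote, urlencode

# fetch(url) -> (events on that page, raw Link header or None)
Fetch = Callable[[str], Tuple[List[dict], Optional[str]]]

_NEXT_LINK = re.compile(r'<([^>]+)>\s*;\s*rel="next"')


def audit_log_url(api_root: str, enterprise: str, phrase: str,
                  per_page: int = 100) -> str:
    """Build the first-page URL for an enterprise audit log query."""
    query = urlencode({"phrase": phrase, "per_page": per_page})
    slug = quote(enterprise, safe="")
    return f"{api_root.rstrip('/')}/enterprises/{slug}/audit-log?{query}"


def next_page(link_header: Optional[str]) -> Optional[str]:
    """Extract the rel="next" URL from a Link header, if there is one."""
    if not link_header:
        return None
    match = _NEXT_LINK.search(link_header)
    return match.group(1) if match else None


def iter_audit_events(fetch: Fetch, first_url: str,
                      max_pages: int = 50) -> Iterator[dict]:
    """Yield events page by page, stopping at the last page or max_pages."""
    url: Optional[str] = first_url
    for _ in range(max_pages):
        if url is None:
            return
        events, link_header = fetch(url)
        yield from events
        url = next_page(link_header)
```

Keeping the HTTP call behind `fetch` means the rate-limit handling stays in one place and the pagination logic can be tested without a network. The `max_pages` bound is a safety valve for a first run against a tenant with a long history.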

The non-obvious gotcha is the API's freshness lag. Events appear in the audit log API within minutes of occurring in most cases, but high-volume buckets (Actions workflow runs, dependency graph updates) can lag by 30-60 minutes. SIEM rules that expect sub-minute event arrival need to be tuned against the actual lag distribution in your tenant.

Pricing-wise: at $21 per user per month for the first 12 months and standard pricing thereafter, a 500-seat enterprise pays $126,000 in year one. The audit log endpoint is the single feature most often cited by compliance and security teams as the load-bearing justification for that line item.

Handling rate limits in code

Production GitHub integrations end up writing two reusable pieces of glue: a header parser that watches X-RateLimit-* and Retry-After, and a queue that throttles outbound calls when remaining capacity is low. The minimum useful Python shape using requests:

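import time
import requests

MAX_RETRIES = 5  # bound retries so a persistent throttle cannot loop forever


class GitHubRateLimiter:
    """Tracks limit, remaining quota, and reset time per X-RateLimit-Resource bucket."""

    def __init__(self):
        self.limit = {}
        self.remaining = {}
        self.reset = {}

    def update(self, response: requests.Response) -> None:
        headers = response.headers
        # Some responses carry no rate limit headers; leave the state untouched.
        if "X-RateLimit-Remaining" not in headers:
            return
        resource = headers.get("X-RateLimit-Resource", "core")
        self.limit[resource] = int(headers.get("X-RateLimit-Limit", "5000"))
        self.remaining[resource] = int(headers["X-RateLimit-Remaining"])
        self.reset[resource] = int(headers.get("X-RateLimit-Reset", "0"))

    def should_wait(self, resource: str = "core", floor: int = 100) -> int:
        """Seconds to pause before the next call; 0 while quota is above the floor."""
        # A bucket we have not seen yet is assumed to have quota.
        if resource not in self.remaining:
            return 0
        # Scale the floor down for small buckets (search allows only 30/minute).
        floor = min(floor, self.limit[resource] // 10)
        if self.remaining[resource] > floor:
            return 0
        return max(0, self.reset[resource] - int(time.time()))


def call_api(url: str, token: str, limiter: GitHubRateLimiter) -> dict:
    resource = "search" if "/search/" in url else "core"
    headers = {
        "Authorization": f"Bearer {token}",
        "X-GitHub-Api-Version": "2026-03-10",
        "Accept": "application/vnd.github+json",
    }

    for attempt in range(MAX_RETRIES):
        wait = limiter.should_wait(resource)
        if wait > 0:
            time.sleep(min(wait, 300))

        r = requests.get(url, headers=headers, timeout=30)
        limiter.update(r)

        # Throttling arrives as 403 or 429 with "rate limit" in the body.
        throttled = r.status_code in (403, 429) and "rate limit" in r.text.lower()
        if not throttled:
            r.raise_for_status()
            return r.json()

        # Retry-After is in seconds; without it, back off exponentially.
        retry_after = r.headers.get("Retry-After")
        time.sleep(int(retry_after) if retry_after else min(60 * 2 ** attempt, 300))

    raise RuntimeError(f"still rate limited after {MAX_RETRIES} attempts: {url}")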

Three points the snippet encodes:

  1. Track per-resource, not just globally. A search-heavy workload throttles independently of a core REST workload, and a single global counter hides which one is the bottleneck.
  2. Retry-After is in seconds. Unlike Meta's BUC header, GitHub's Retry-After follows the HTTP standard. The pitfall here is the opposite of Meta: developers who have been burned by minute-encoded values sometimes over-correct.
  3. The 403 with "rate limit" in the body is the secondary-limit signal. Primary limits return 403 with X-RateLimit-Remaining: 0. Secondary limits return 403 with a different body. Both are retryable, but secondary limits often want a longer backoff than the Retry-After value suggests, especially for the content-creating bucket.

EMU boundaries your sales engineer skipped

Enterprise Managed Users (EMU) is the GHEC option that puts user identity entirely under your IdP. The pitch is clean: every user account is provisioned via SCIM from Okta or Entra ID, the user signs in via SSO, the account dies when you remove them in the IdP, and the company controls the namespace. The gap between the pitch and the integration reality is where teams lose two weeks they did not budget.

The hard constraints on EMU accounts that affect integration design:

  • No personal repositories. Every repository must live under an organization in the enterprise. The /user/repos endpoint for creating a personal repo returns 403. A developer wanting a sandbox needs an organization-level "sandboxes" repo or a personal GitHub.com account separately.
  • No forks outside the enterprise. The fork API returns 403 if the target is not inside the enterprise. Workflows that depend on forking an upstream open-source project to contribute back will not work; the supported pattern is a separate non-EMU account for OSS contributions.
  • No cross-org commenting or starring. EMU users cannot comment on issues in github.com public repos or star them. This breaks Slack-to-GitHub integrations that try to react to a developer's comment on an external project.
  • Username transformation. EMU usernames are suffixed with _short_code derived from the enterprise short code. Integrations that pattern-match GitHub usernames (e.g., Slack handles, IdP attributes) need to know about the suffix.
  • No GitHub.com personal account merge. If a developer already has a GitHub.com account, EMU does not merge or link to it. The two accounts are separate entities.
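The username suffix is the constraint most likely to bite identity-matching code. Here is a minimal sketch of normalizing a login before comparing it with a Slack handle or IdP attribute, assuming the suffix is an underscore followed by the enterprise short code as described above. `split_emu_login` is a hypothetical helper, and the `acme` short code in the usage lines is made up.

```python
from typing import Tuple


def split_emu_login(login: str, short_code: str) -> Tuple[str, bool]:
    """Return (handle without the EMU suffix, whether the suffix was present)."""
    suffix = f"_{short_code}"
    has_suffix = (
        bool(short_code)
        and len(login) > len(suffix)                 # something must precede it
        and login.lower().endswith(suffix.lower())   # logins are case-insensitive
    )
    if has_suffix:
        return login[: -len(suffix)], True
    return login, False


print(split_emu_login("mona-lisa_acme", "acme"))  # ('mona-lisa', True)
print(split_emu_login("octocat", "acme"))         # ('octocat', False)
```

Returning the flag alongside the handle lets callers treat EMU and non-EMU accounts differently without a second lookup.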

The integration design rule: if any product feature depends on cross-org GitHub activity (mentions in OSS, profile views, public contributions), it cannot rely solely on EMU accounts. Either route those features through a non-EMU service account or design around the boundary.

GHES version skew and the X-GitHub-Api-Version header

GitHub Enterprise Server ships a new version roughly every 3 months and supports each version for about 12 months. In 2026 production deployments, the realistic version distribution among customers looks like this:

GHES version | Released | Common in production?
--- | --- | ---
3.8 | 2025 H2 | Yes (newest)
3.7 | 2025 H1 | Yes
3.6 | 2024 H2 | Yes
3.5 | 2024 H1 | Some
3.4 and earlier | 2023 and before | Less common, but still seen

Endpoints added in GHEC propagate into GHES with a lag. The audit log Streaming API is GHEC-only. SCIM enhancements often land on GHEC two GHES releases before they appear on a tagged GHES build. Code that targets a customer fleet across this version range has to do two things.

First, on every REST call, send the X-GitHub-Api-Version header. Without it, GitHub defaults to 2022-11-28, which is supported through 2028-03-10. The current 2026-03-10 version is the better target if your customer instances are recent. The choice matters because newer API versions carry breaking changes (renamed fields, removed deprecated endpoints) that the default does not surface.

Second, at the start of each integration session against a GHES instance, call GET /meta to retrieve the installed GHES version. Branch your code on that version rather than assuming the latest. The installed_version field on the response is what you key off.
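The version check can be a small pure function over the decoded GET /meta body. The sketch below assumes `installed_version` is a dotted string such as "3.7.2" and that a response without the field comes from GHEC. Both helper names are hypothetical.

```python
from typing import Optional, Tuple


def ghes_version(meta: dict) -> Optional[Tuple[int, int]]:
    """Parse (major, minor) from a /meta response; None when the field is absent."""
    raw = meta.get("installed_version")
    if not raw:
        return None  # assumption: no installed_version means GHEC
    parts = str(raw).split(".")
    try:
        return int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        raise ValueError(f"unrecognized installed_version: {raw!r}") from None


def at_least(meta: dict, minimum: Tuple[int, int]) -> bool:
    """True on GHEC, or on GHES at or above the given (major, minor)."""
    version = ghes_version(meta)
    return version is None or version >= minimum


print(at_least({"installed_version": "3.5.0"}, (3, 6)))  # False
print(at_least({"installed_version": "3.7.2"}, (3, 6)))  # True
```

Comparing tuples rather than strings avoids the classic bug where "3.10" sorts before "3.9".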

GraphQL has its own versioning model. Mutations and types deprecated by GitHub remain available for at least 12 months after deprecation, and clients see the @deprecated annotation in the introspection schema. If your code uses the GraphQL Code Generator or similar tooling that pins to a schema snapshot, refresh the snapshot quarterly against the latest GHES version your customer base supports.

REST vs GraphQL on enterprise-scale orgs

The decision rule for REST versus GraphQL on GitHub is more nuanced at enterprise scale than the docs imply, and the point systems are the key. GraphQL's primary limit is 5,000 points per hour, and a query's cost is computed from the number of nodes it requests rather than counted as one call. Separately, the secondary limits bill each GraphQL request at one point for a query and five points for a mutation. A REST request counts as one call against the hourly limit regardless of complexity.

For large orgs, the calculus often flips toward GraphQL on read-heavy workflows because a single GraphQL query can pull data that would otherwise require dozens of REST calls. Example: fetching a repository's recent issues with their comments, labels, and assignees takes 1+N REST calls (1 list, N detail) but a single GraphQL query per page of up to 100 issues. At 5,000 issues, REST burns more than 5,000 calls; GraphQL needs about 50 paged queries, each costing a handful of points. GraphQL wins.

For write-heavy workflows, the calculus reverses. A mutation is billed at five points against the secondary points budget, and each one still creates a single object, so GraphQL's batching advantage disappears. Bulk-creating 1,000 issues in REST costs 1,000 calls, which is well within the hourly primary limit but over the content-creating sub-limit of 80/minute and 500/hour, so the job cannot finish within a single hour. In GraphQL, the same 1,000 mutations run into the same content-creating sub-limit. REST is usually better for high-volume writes because the per-request overhead is smaller and the points cap is not the bottleneck.

The other practical difference is error handling. REST errors are HTTP status codes with JSON bodies. GraphQL always returns 200 with errors embedded in the response body's errors array, and a single query can have partial success — some fields populated, others with errors. Integration code needs separate paths for "REST 4xx response" and "GraphQL 200 response with errors array."
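The GraphQL path can be isolated in one small function that takes the decoded response body. This is a sketch, assuming the standard GraphQL response shape with top-level `data` and `errors` keys. `GraphQLError` and `unpack_graphql` are illustrative names, not part of any GitHub client library.

```python
from typing import List, Tuple


class GraphQLError(Exception):
    """Raised when a GraphQL response carries errors and no usable data."""


def unpack_graphql(payload: dict) -> Tuple[dict, List[dict]]:
    """Split a 200 response into (data, errors).

    Total failure (no data) raises. Partial success returns the data
    together with the errors so the caller can decide what to keep.
    """
    errors = payload.get("errors") or []
    data = payload.get("data")
    if data is None:
        messages = "; ".join(e.get("message", "unknown error") for e in errors)
        raise GraphQLError(messages or "no data returned")
    return data, errors


# Partial success: one field resolved, another failed.
data, errors = unpack_graphql({
    "data": {"repository": {"name": "demo"}, "organization": None},
    "errors": [{"message": "Could not resolve organization", "path": ["organization"]}],
})
print(data["repository"]["name"], len(errors))  # demo 1
```

Returning the errors instead of raising on every non-empty `errors` array keeps partially successful bulk reads usable, which matters when one inaccessible repository would otherwise fail a whole batch.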

What to do before you ship against GHEC or GHES

Concrete checklist for an enterprise GitHub integration moving from prototype to production:

  1. Pick GitHub Apps over PATs from day one. The migration cost from PATs later is high (re-issuing tokens, re-coding auth flows, re-onboarding users). Standing up a GitHub App during prototyping costs roughly two hours of additional setup and saves the migration entirely.

  2. Pin X-GitHub-Api-Version on every REST call. Choose a date string explicitly rather than relying on the default. When GitHub eventually retires the 2022-11-28 default, code without an explicit header silently shifts to the next-oldest version.

  3. Instrument all five X-RateLimit-* headers, plus Retry-After, from the first request. Including X-RateLimit-Resource is what distinguishes "we are out of rate limit budget" from "we are throttled on this specific resource bucket."

  4. Plan for both GHEC and GHES if your customer base spans both. Feature-detect where an endpoint can be probed directly, and fall back to the GET /meta version check where it cannot. Run integration tests against the GHES versions your customer fleet uses, not just the latest.

  5. Budget for the audit log API ingestion if you sell into regulated industries. A 500-seat customer generates 50K-200K events per month; that scales the design of your ingestion buffer, partitioning strategy, and retention policy.

  6. Map your data and user model to the EMU boundary before sales calls. Selling an integration that breaks on EMU accounts after the customer has rolled out EMU is the kind of mistake that loses a six-figure annual contract.

  7. Use GraphQL for read-heavy bulk fetches, REST for write-heavy bulk writes. The point-based ceiling and the per-request overhead push in opposite directions; matching the protocol to the workload shape changes effective throughput by an order of magnitude.

For the consolidated reference on rate limits, auth methods, and Enterprise-only endpoints, the GitHub Enterprise API tool page holds the version-current summary. For the broader developer-tools API directory, the developer tools category lists adjacent and complementary services.
