
Caching

Overview

Caching improves application performance by storing frequently accessed data in memory for quick retrieval, so that data which is slow to fetch or compute does not have to be produced again on every request. A common example is web caching, where content from frequently visited web pages is stored and served to users. This avoids reloading the entire page on each visit, which improves both the user experience and application performance.

Caching can be integrated into various application layers, such as the database layer, where caching frequently requested data reduces the load on the database and enables faster request processing. In the context of generative AI and Large Language Models (LLMs), where computations can be intensive and costly, effective caching is especially valuable: serving a stored response instead of recomputing it saves both time and cost.

flowchart LR
  classDef Orange fill:#F8773D,stroke:#000000,stroke-width:2px;
  classDef Green fill:#56AE89,stroke:#000000,stroke-width:2px;
  SDK
  class SDK Green
  subgraph Aperture_Cloud ["Aperture Cloud"]
    Caching
    class Caching Orange
  end
  class Aperture_Cloud Green
  SDK -- "Set/Get Response" --> Caching

The diagram above shows how developers using the Aperture SDK can connect to Aperture Cloud to set the cache or look up a stored response before processing an incoming request.

Prerequisites

Before exploring Aperture's caching capabilities, make sure that you have signed up to Aperture Cloud and set up an organization. For more information on how to sign up, follow our step-by-step guide.

Caching with Aperture SDK

The first step to using the Aperture SDK is to import and set up Aperture Client:


You can obtain your organization address and API Key within the Aperture Cloud UI by clicking the Aperture tab in the sidebar menu.
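Rather than hard-coding these two values, one option is to read them from the environment and fail fast when either is missing. The sketch below is only an illustration: the `ApertureSettings` shape, the `loadApertureSettings` helper, and the variable names `APERTURE_ADDRESS` and `APERTURE_API_KEY` are assumptions made for this example and are not part of the Aperture SDK.

```typescript
// Hypothetical settings shape for this sketch; the field names mirror the
// "organization address" and "API Key" described above, not a confirmed SDK type.
interface ApertureSettings {
  address: string;
  apiKey: string;
}

// Reads both values from an environment-like map and rejects missing or
// blank entries. The variable names are assumptions made for this example.
function loadApertureSettings(
  env: Record<string, string | undefined>
): ApertureSettings {
  const address = env["APERTURE_ADDRESS"]?.trim();
  const apiKey = env["APERTURE_API_KEY"]?.trim();
  if (!address) {
    throw new Error("APERTURE_ADDRESS is not set");
  }
  if (!apiKey) {
    throw new Error("APERTURE_API_KEY is not set");
  }
  return { address, apiKey };
}
```

In a Node.js process this helper would typically be called as `loadApertureSettings(process.env)`, with the result passed on to whatever client setup the SDK documentation prescribes.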

The next step is to make a startFlow call to Aperture. For this call, specify the control point (caching-example in our example) and the resultCacheKey, which is used to access the cache in Aperture Cloud. To obtain detailed telemetry data for each Aperture request, also include labels related to your business logic.
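When the cached response depends on request parameters, the cache key should be the same for the same inputs regardless of the order in which they arrive. The following is a minimal sketch of one way to derive such a key. The `buildResultCacheKey` helper and its `prefix|name=value` format are hypothetical choices for illustration; the source does not prescribe any key format.

```typescript
// Values allowed in the hypothetical request parameters below.
type ParamValue = string | number | boolean;

// Builds a deterministic key: parameter names are sorted, and names and
// values are percent-encoded so that "|" and "=" inside them cannot collide
// with the separators used by this (assumed) key format.
function buildResultCacheKey(
  prefix: string,
  params: Record<string, ParamValue>
): string {
  const parts = Object.keys(params)
    .sort()
    .map(
      (name) =>
        `${encodeURIComponent(name)}=${encodeURIComponent(String(params[name]))}`
    );
  return [prefix, ...parts].join("|");
}

const exampleKey = buildResultCacheKey("caching-example", {
  user: "alice",
  page: 2,
});
// exampleKey === "caching-example|page=2|user=alice"
```

The resulting string could then be supplied as the resultCacheKey, so that requests with identical parameters share one cached response.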


After making a startFlow call, check for a cached response in Aperture Cloud by comparing the result of flow.resultCache().getLookupStatus() with LookupStatus.Hit. On a hit, the stored response can be served directly. On a cache miss, process the request and store the new response in the cache. This is where the ttl (Time to Live) becomes important, as it dictates how long the response stays in the cache. A longer TTL suits stable data that rarely changes, keeping it readily available for frequent access. A shorter TTL suits dynamic data that needs regular updates, keeping the cache relevant and accurate.

Make the end call after processing each request so that telemetry data is sent, providing granular visibility into each flow.
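The control flow described here (look up, serve on a hit, compute and store with a TTL on a miss, always finish with an end call) can be sketched against a local stand-in. Everything in this block is hypothetical and exists only to make the example self-contained: `InMemoryResultCache`, `handleRequest`, the string-based `LookupStatus`, and the `onEnd` callback are not the Aperture SDK, and the real calls are asynchronous requests to Aperture Cloud.

```typescript
// Local stand-in for the lookup status; NOT the SDK's LookupStatus.
type LookupStatus = "HIT" | "MISS";

interface CacheEntry {
  value: string;
  expiresAtMs: number;
}

// In-memory stand-in for the remote result cache (illustration only).
class InMemoryResultCache {
  private entries = new Map<string, CacheEntry>();

  // Reports a hit while the entry's TTL has not elapsed, otherwise a miss.
  lookup(key: string, nowMs: number): { status: LookupStatus; value?: string } {
    const entry = this.entries.get(key);
    if (entry === undefined || nowMs >= entry.expiresAtMs) {
      this.entries.delete(key);
      return { status: "MISS" };
    }
    return { status: "HIT", value: entry.value };
  }

  // Stores a value that stays valid for ttlSeconds, counted from nowMs.
  set(key: string, value: string, ttlSeconds: number, nowMs: number): void {
    this.entries.set(key, { value, expiresAtMs: nowMs + ttlSeconds * 1000 });
  }
}

// Handles one request: serve the cached value on a hit, otherwise compute and
// store it. onEnd stands in for the end call and runs even if compute throws.
function handleRequest(
  cache: InMemoryResultCache,
  key: string,
  compute: () => string,
  ttlSeconds: number,
  nowMs: number,
  onEnd: () => void
): string {
  try {
    const result = cache.lookup(key, nowMs);
    if (result.status === "HIT" && result.value !== undefined) {
      return result.value;
    }
    const fresh = compute();
    cache.set(key, fresh, ttlSeconds, nowMs);
    return fresh;
  } finally {
    onEnd();
  }
}
```

Placing the end call in a `finally` block is a design choice of this sketch: it guarantees the flow is closed on the hit path, the miss path, and when processing fails. The current time is passed in explicitly so the behavior is easy to test.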


Caching in Action

Begin by cloning the Aperture JS SDK.

Switch to the example directory and follow these steps to run the example:

  1. Install the necessary packages:
    • Run npm install to install the base dependencies.
    • Run npm install @fluxninja/aperture-js to install the Aperture SDK.
  2. Run npx tsc to compile the TypeScript example.
  3. Run node dist/caching_example.js to start the compiled example.

Once the example is running, it will prompt you for your organization address and API Key. In the Aperture Cloud UI, select the Aperture tab from the sidebar menu, then copy both values and enter them at the prompt to establish a connection between the SDK and Aperture Cloud.

Aperture will cache and serve the response for the duration specified by the TTL. Once the TTL expires, the next cache lookup returns a miss, and the response is set in the cache again.
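The hit, expiry, and re-set cycle can be illustrated with a small timeline simulation. This is a sketch under stated assumptions: `isExpired` and `simulateLookups` are hypothetical helpers, and treating an entry as expired exactly when the TTL has elapsed is a choice made for this example, not a documented boundary rule of Aperture.

```typescript
// Returns true once ttlSeconds have elapsed since the entry was stored.
// The inclusive boundary (>=) is an assumption of this sketch.
function isExpired(
  storedAtMs: number,
  ttlSeconds: number,
  nowMs: number
): boolean {
  return nowMs - storedAtMs >= ttlSeconds * 1000;
}

// Simulates lookups for one key at the given times (in milliseconds).
// On a miss the response is stored again at that time, which restarts
// its TTL window.
function simulateLookups(
  ttlSeconds: number,
  lookupTimesMs: number[]
): string[] {
  let storedAtMs: number | undefined;
  return lookupTimesMs.map((nowMs) => {
    if (storedAtMs !== undefined && !isExpired(storedAtMs, ttlSeconds, nowMs)) {
      return "hit";
    }
    storedAtMs = nowMs; // miss: the response is computed and stored again
    return "miss";
  });
}

const timeline = simulateLookups(30, [0, 10000, 29999, 30000, 45000, 60000]);
// timeline is ["miss", "hit", "hit", "miss", "hit", "miss"]
```

With a 30-second TTL, the first lookup misses and stores the response, lookups within the next 30 seconds hit, and the lookup at 30000 ms misses and starts a new 30-second window.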

Using Aperture's caching feature, developers can enhance application performance by storing commonly requested data, thereby reducing system load.