Lifecycle of a Kernel

Kernel initialization is the process of setting up cloud kernels (including kernel variables, code paths, options and other state) before user code runs in them. It also covers the configuration of sandboxing mechanisms such as the kernel protected mode (also known as sandbox mode) and the J/Link cloud security manager.
In Wolfram Enterprise Private Cloud (EPC), webKernel is responsible for the initialization and management of cloud kernels.

Package Loading

To ensure that the correct versions of Wolfram Language packages are loaded into cloud kernels, the $Path variable and the paclet system must be managed correctly. The codebase for any type of cloud kernel consists of multiple layers of packages and paclets:
  • General-purpose paclets in a managed paclets directory. Rather than having each cloud user download separate copies of Wolfram-published paclets, select paclets are installed in a shared directory, and the paclet manager in each kernel is pointed to this location.
  • Custom paclets specified by the user. These are placed in the $Path via $BaseDirectory/Kernels/init.m and $UserBaseDirectory/Kernels/init.m. More information on these path elements is in Chapter 3, “Setup.”
  • WolframApplications paclets in a shared directory. This includes some paclets that are only found in cloud kernels, e.g. supporting web graphics, server-side Wolfram Language code, etc. Also located in this directory are cloud-specific versions of paclets that are found in the Wolfram Engine layout but are updated more frequently than Wolfram Engine versions are released (e.g. CloudObjects, Forms, Templating). These are visible to the user’s Wolfram Language evaluations.
  • Web app packages in webMathematica’s WEB-INF/Applications directory. These are typically cloud-specific packages that cannot be read from the user’s Wolfram Language evaluations.
  • Wolfram Engine layout. Each kernel runs a release build of the Wolfram Engine. These are visible to the user’s Wolfram Language evaluations.

Sandboxing and Security Isolation

One top priority of the kernel initialization process is setting up various security mechanisms. These address the standard security concerns for internet-connected web products:
  • Preventing attacks on the system and other users
  • Isolating users from each other (ensure privacy, prevent denial of service)
  • Isolating a user from internal/system resources (ensure protection of proprietary information, prevent denial of service)

The main security mechanisms used by the cloud are:
  • The Wolfram Engine kernel sandbox. Also known as protected mode, this security feature can be activated to set allowlist-based restrictions on user file access. If a file path is not on the appropriate allowlist, the kernel responds as if the file does not exist.
  • The J/Link cloud security manager. Like the kernel sandbox, it uses a set of allowlists to restrict read/write access, and it also restricts which native Java libraries are allowed.

Both of these security layers (the kernel sandboxing mechanism and the J/Link cloud security manager) can be disabled when needed, for example in environments where all users of EPC are trusted, or if sandboxing interferes with RLink. To toggle these security features, refer to slide 8 of the configuration notebook. In the section on the kernel sandbox, set the option "UseSandbox" to False. Further down that same slide, in the section on the J/Link sandbox, set the option "JLinkUseSandbox" to False.
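
The allowlist behavior described above can be illustrated with a small model. The following Python sketch is not Wolfram code and does not reflect how the kernel sandbox is implemented; the class name SandboxedFiles, the in-memory file table and the example paths are all hypothetical. It only demonstrates the documented behavior that a path outside the allowlist is reported as nonexistent rather than forbidden, and that a single switch (modeled on the "UseSandbox" option) turns the check off.

```python
import posixpath


class SandboxedFiles:
    """Toy model of allowlist-based file access (hypothetical, not Wolfram code)."""

    def __init__(self, files, allowlist, use_sandbox=True):
        # files: path -> contents, standing in for a real file system
        self.files = {posixpath.normpath(p): c for p, c in files.items()}
        self.allowlist = [posixpath.normpath(p) for p in allowlist]
        self.use_sandbox = use_sandbox  # modeled on the "UseSandbox" option

    def _allowed(self, path):
        if not self.use_sandbox:
            return True
        # A path is allowed if it is an allowlisted root or lies beneath one.
        return any(
            path == root or path.startswith(root.rstrip("/") + "/")
            for root in self.allowlist
        )

    def exists(self, path):
        # Normalize first so ".." segments cannot step outside an allowed root.
        path = posixpath.normpath(path)
        # Outside the allowlist, the file is reported as missing.
        return self._allowed(path) and path in self.files

    def read(self, path):
        if not self.exists(path):
            raise FileNotFoundError(path)
        return self.files[posixpath.normpath(path)]


fs = SandboxedFiles(
    files={"/home/alice/data.txt": "hello", "/etc/secret.conf": "internal"},
    allowlist=["/home/alice"],
)
print(fs.exists("/home/alice/data.txt"))  # True
print(fs.exists("/etc/secret.conf"))      # False: present, but not allowlisted
```

Reporting a blocked file as missing means user code cannot tell a protected file from one that does not exist.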

Phases of Initialization

A cloud kernel goes through several distinct phases of initialization. To understand these phases, it helps to know that the basic lifecycle of any kernel in the cloud is managed by webKernel. In webKernel, each pool has a fixed number of kernels, and the application may acquire a kernel to perform evaluations and then release it when done. When a kernel is acquired, its behavior is determined by its scope, which is either request scope or session scope. A session-scoped kernel (described in Chapter 4, “Allocating Kernel Resources,” Session Kernel Pools) is terminated when it is released, while general kernels return to the pool until they are needed again.
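
The acquire/release lifecycle can be sketched as a small model. This is a Python illustration, not webKernel code: the names Kernel and KernelPool are hypothetical, and creating a replacement kernel immediately upon release is an assumption made to keep the pool at its fixed size.

```python
from dataclasses import dataclass, field
from itertools import count

_next_id = count(1)


@dataclass
class Kernel:
    kernel_id: int = field(default_factory=lambda: next(_next_id))
    alive: bool = True


class KernelPool:
    """Fixed-size pool whose release behavior depends on its scope (hypothetical model)."""

    def __init__(self, size, scope):
        if scope not in ("request", "session"):
            raise ValueError("scope must be 'request' or 'session'")
        self.scope = scope
        self.idle = [Kernel() for _ in range(size)]  # filled at startup
        self.in_use = {}

    def acquire(self):
        if not self.idle:
            raise RuntimeError("no kernel available in this pool")
        kernel = self.idle.pop(0)
        self.in_use[kernel.kernel_id] = kernel
        return kernel

    def release(self, kernel):
        del self.in_use[kernel.kernel_id]
        if self.scope == "session":
            # Session-scoped kernels are terminated on release.
            kernel.alive = False
            # Assumption: a fresh kernel replaces the destroyed one right away.
            self.idle.append(Kernel())
        else:
            # Other kernels return to the pool for reuse.
            self.idle.append(kernel)


request_pool = KernelPool(size=2, scope="request")
k = request_pool.acquire()
request_pool.release(k)
print(k.alive)  # True: the same kernel is back in the pool
```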

Kernel Creation

A kernel is created either to fill a pool when Tomcat starts or to replace another kernel that has been destroyed.
There are four key places in the code where the kernel is initialized, starting at the point of its creation:
  • MSPConfiguration.xml, the KernelInitialization parameter for the pool. This file is generated from a template using properties configured for the particular cloud installation (e.g. private cloud, public cloud). It is kept to the bare essentials: it receives configuration parameters and sets variables such as CloudSystem`Private`$Pool, which holds the pool type; $EvaluationEnvironment; $CloudBase, which points to the WolframApplications directory; and a few others. The last thing it does is load KernelInitialize. The aim is to keep this code minimal: anything that can be moved to KernelInitialize.m should be.
  • KernelInitialize from the CloudSystem package. This is the main body of kernel initialization. It sets up PacletManager, pre-loads numerous packages and paclets including CloudObject, seeds the kernel’s random number generator and disables functions that are not supported in the cloud. The last thing it does is load the CloudSystem package itself.
  • The CloudSystem package. This is loaded mainly for its function definitions, which are called later, including InitializeServer[] and ConfigureKernelForUser[].
  • The CloudSystem`InitializeServer[] function from the CloudSystem package. This also pre-loads some libraries and packages.
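
The chaining of these four steps, where each stage ends by loading the next, can be sketched as follows. This is a Python model for illustration only, not the actual initialization code: the function names, the config keys and the recorded log entries are hypothetical, and the point at which InitializeServer[] is invoked is an assumption.

```python
def run_kernel_creation(config):
    """Model the order of kernel initialization stages (hypothetical sketch)."""
    state = {"variables": {}, "loaded": [], "log": []}

    def kernel_initialization_parameter():
        # Stage 1: minimal bootstrap from the pool's KernelInitialization parameter.
        state["log"].append("KernelInitialization parameter")
        state["variables"]["CloudSystem`Private`$Pool"] = config["pool_type"]
        state["variables"]["$EvaluationEnvironment"] = config["evaluation_environment"]
        state["variables"]["$CloudBase"] = config["cloud_base"]
        kernel_initialize()  # the last thing it does is load KernelInitialize

    def kernel_initialize():
        # Stage 2: main body of initialization (paclet setup, pre-loading).
        state["log"].append("KernelInitialize")
        state["loaded"] += ["PacletManager", "CloudObject"]
        load_cloud_system()  # the last thing it does is load CloudSystem

    def load_cloud_system():
        # Stage 3: defines functions that are called later.
        state["log"].append("CloudSystem package")
        state["loaded"].append("CloudSystem")

    def initialize_server():
        # Stage 4: assumed here to be called once the package is loaded.
        state["log"].append("InitializeServer[]")

    kernel_initialization_parameter()
    initialize_server()
    return state


state = run_kernel_creation(
    {
        "pool_type": "Session",                    # placeholder value
        "evaluation_environment": "Session",       # placeholder value
        "cloud_base": "/path/to/WolframApplications",  # placeholder path
    }
)
print(state["log"])
```

Keeping the first stage to a handful of variable assignments mirrors the stated goal of moving as much logic as possible out of the generated XML and into KernelInitialize.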

User Configuration

A significant amount of initialization takes place once the user is known. The user configuration step is fairly detailed: it sets the variables, options and other settings that depend on which user will use the kernel, such as $HomeDirectory, $TemporaryDirectory and the initial working directory. This is also where security mechanisms such as the kernel sandbox and the J/Link security manager are configured.

User Variables and Settings
  • $HomeDirectory: set to the user’s cloud home directory
  • $UserBaseDirectory: located in the Base directory under the home directory
  • $TemporaryDirectory: in public kernels, this is a randomly generated directory for the kernel itself to provide isolation between requests; for session and scheduled task kernels, it is generally the same location for the same user
  • $EvaluationCloudObject: for cloud notebooks, this is set to the cloud object of the notebook hosting the evaluation; for public and scheduled task requests, it is the cloud object initiating the evaluation (e.g. the APIFunction or ScheduledTask or similar deployed object)
  • Directory[]: generally set to the home directory, although in the future it may be configurable for public and scheduled task evaluations
  • $CloudUserID, $CloudUserUUID: set as if an implicit CloudConnect[] had taken place
  • $GeoLocation: typically set either from GeoIP information from the requester’s IP address or for a session by the mobile app
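
The directory-related settings in this list can be sketched as a small function. This Python model is illustrative only and is not the ConfigureKernelForUser[] implementation: the function name, the temp_root parameter and the exact naming of temporary directories are hypothetical. It shows the documented distinction that public kernels get a fresh random temporary directory, while session and scheduled task kernels reuse a per-user location.

```python
import posixpath
import uuid


def configure_kernel_for_user(user_home, kernel_type, temp_root="/tmp/cloud"):
    """Return user-dependent settings for a kernel (hypothetical model)."""
    settings = {
        "$HomeDirectory": user_home,
        # The user base directory is the Base directory under the home directory.
        "$UserBaseDirectory": posixpath.join(user_home, "Base"),
        # The initial working directory is generally the home directory.
        "Directory[]": user_home,
    }
    if kernel_type == "public":
        # Random per-kernel directory, isolating requests from each other.
        temp = posixpath.join(temp_root, "kernel-" + uuid.uuid4().hex)
    else:
        # Session and scheduled task kernels: stable location per user (assumed layout).
        temp = posixpath.join(temp_root, "user-" + posixpath.basename(user_home))
    settings["$TemporaryDirectory"] = temp
    return settings


public_a = configure_kernel_for_user("/homes/alice", "public")
public_b = configure_kernel_for_user("/homes/alice", "public")
session = configure_kernel_for_user("/homes/alice", "session")
print(public_a["$TemporaryDirectory"] != public_b["$TemporaryDirectory"])  # True
```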

Localized Variables
  • $GeoLocationCountry: entity for the country of $GeoLocation
  • $GeoLocationCity: entity for the city nearest to $GeoLocation, if any
  • $TimeZone: this should be set for the location of EPC; additional configuration options will be added to the configuration notebook in an upcoming version
  • $DateStringFormat: set according to $GeoLocationCountry using code copied from the Wolfram|Alpha code base
  • (Future) $UnitSystem: set according to $GeoLocationCountry
  • (Future) $Language: set according to $GeoLocationCountry

Request Metadata

For deployment kernels, it is essential that the code has access to metadata about the web request that initiated the evaluation. This information is also available for session kernels, though it is not often needed there. In task kernels, metadata is generally not available. In this sense, task kernels are similar to conventional (non-cloud) kernels, such as those in the desktop application, which run independently of any web server.
  • HTTPRequestData[]: the underlying data to drive this function is provided to the kernel
  • $RequesterCloudUserID, $RequesterCloudUserUUID: the identity of the user making the web request, if known (i.e. if the request is authenticated); this is identical to the corresponding $CloudUserID and $CloudUserUUID
  • $UserAgentString and related symbols: set according to the user agent string and components parsed by a user agent string parser
  • $RequesterAddress: IP address from which the web request originated
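
How metadata availability differs by kernel type can be sketched as follows. This Python model is illustrative only: the function name, the shape of the request dictionary and the choice to omit requester identity for unauthenticated requests are assumptions, not the documented behavior of any Wolfram symbol.

```python
def request_metadata(kernel_type, request=None):
    """Model which request metadata a kernel receives (hypothetical sketch)."""
    # Task kernels generally have no initiating web request.
    if kernel_type == "task" or request is None:
        return {}
    meta = {
        "$RequesterAddress": request["remote_addr"],
        "$UserAgentString": request["headers"].get("User-Agent"),
    }
    # Requester identity is only known for authenticated requests.
    user = request.get("authenticated_user")
    if user is not None:
        meta["$RequesterCloudUserID"] = user["id"]
        meta["$RequesterCloudUserUUID"] = user["uuid"]
    return meta


anonymous = request_metadata(
    "deployment",
    {"remote_addr": "203.0.113.7", "headers": {"User-Agent": "ExampleAgent/1.0"}},
)
print(sorted(anonymous))  # ['$RequesterAddress', '$UserAgentString']
```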