Squid Web Cache wiki

🔗 Feature: Add-On Helpers for Request Manipulation

🔗 Details

Every network and installation has its own criteria for operation. The Squid developers and community do not have the time or inclination to write code for every minor situation. Instead, we provide ways to easily extend various operations with local add-on scripts or programs, which we call helpers.

🔗 What language are helpers meant to be written in?

Helpers can be written in any language you like. They can be executable programs or interpreted scripts.

The helpers bundled with Squid are currently written in Bash shell script, awk script, Perl script, and C++. There are also frameworks available for helpers built in Python or Ruby.

:warning: There are several languages which encounter difficulties though:

🔗 How do the helpers communicate with Squid?

The interface with Squid is very simple. The helper is passed a limited amount of information on stdin to perform its expected task. The result is passed back to Squid via stdout, and any errors or debugging traces are sent back on stderr.

See the particular interface protocols below for details about the line syntax the helper is expected to receive and send on each interface.
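The stdin/stdout/stderr exchange described above can be sketched as a small loop. This is a minimal illustration, not one of the bundled helpers: `run_helper`, `echo_handler` and the `error_reply` parameter are hypothetical names, and the `OK`/`ERR` replies assume an interface that uses those result codes.

```python
import io
import sys


def run_helper(handle, infile=sys.stdin, outfile=sys.stdout,
               errfile=sys.stderr, error_reply="ERR"):
    """Read one request per line and write one reply per line until EOF."""
    for line in infile:
        request = line.rstrip("\r\n")
        try:
            reply = handle(request)
        except Exception as exc:
            # Diagnostics go to stderr, never to stdout.
            print(f"helper error: {exc}", file=errfile)
            reply = error_reply
        outfile.write(reply + "\n")
        # Flush after every reply so Squid is not left waiting on a buffer.
        outfile.flush()
    # Reaching this point means stdin was closed: the helper should exit.


def echo_handler(request):
    """Hypothetical handler: accept any non-empty request."""
    return "OK" if request else "ERR"


# Demonstration with in-memory streams instead of the real stdin/stdout.
demo_in = io.StringIO("first request\n\nsecond request\n")
demo_out = io.StringIO()
run_helper(echo_handler, demo_in, demo_out, io.StringIO())
```

A real helper would call `run_helper(echo_handler)` with the default streams. Flushing after each line matters because a reply left in a buffer looks to Squid like a helper that never answered.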

🔗 Why is my helper not starting up with Squid?

Squid-3.2 and newer support dynamic helper initialization. That means a helper is only started if it is needed. If Squid is configured with a startup=N value greater than 0, you can expect that many instances of your helper to be started when Squid starts. This is not necessarily desirable for a Squid that needs fast startup or restart times.

With startup=0 configured, the first HTTP request through Squid is expected to start at least one instance of most helpers. But if, for example, an external ACL is configured and only tested on rare occasions, its helper will not be started until that rare occasion happens for the first time.

🔗 What happens when Squid shuts down or reconfigures?

When shutting down, reconfiguring, or at any other time Squid needs to shut down a helper, Squid schedules closure of the helper's stdin connection. Once all the in-progress lookups are completed, the helper should see this close signal (end of file) when reading stdin.

Shutting down and restarting are limited by the shutdown_timeout, which may cause Squid to abort before it has received all the responses. If this happens, the client connections are terminated just as abruptly as the helper, so the lost helper responses are not an issue.

🔗 Can I write a helper that talks to Squid on more than one interface?

You can. In a way.

Squid runs the configured helper for each interface as a separate child process. Your helper can be written to detect other running instances of itself and communicate between them, effectively sharing memory and/or state outside of Squid regardless of the interface Squid is using to run each instance.

NP: Just keep in mind that the number of instances (children) running on each interface is configurable and could be anything from zero to many hundreds. So do not make any assumptions about which interface another instance is running on.

🔗 Squid operations which provide a helper interface

Squid-2.6 and later all support:

Squid-2.7 and Squid-3.1+ support:

Squid-3.1+ support:

Squid-3.4+ support:

Squid-3.5+ support:

Squid-3.1 and later also support eCAP plugins and ICAP services which differ from helper scripts in many ways.

🔗 Helper states

An individual helper process may be in one or more of the following states:

| Key | Name | Meaning |
|-----|------|---------|
| B | BUSY | Squid is expecting a response from the helper process. |
| W | WRITING | Squid is sending one or more requests to a stateless helper process. Squid has not been notified that all the sent data has been written. A WRITING helper is a BUSY helper. Please note that reporting this state is currently not supported for stateful helpers. |
| R | RESERVED | Squid is sending a request to a stateful helper process. Squid has not been notified that all the sent data has been written. |
| P | PLACEHOLDER | There is at least one master transaction waiting for this stateful helper (but not necessarily this specific stateful helper process) to become available (i.e. not BUSY). |
| C | CLOSING | Squid closed its writing socket for the helper process, but the helper has not quit yet (or, to be more precise, has not closed its stdout yet). |
| S | SHUTDOWN PENDING | Squid marked this helper process for eventual closure but has not yet initiated that closure (usually because the helper is still BUSY). |

The above table does not reflect some esoteric corner cases, especially when it comes to the conditions for ending a helper state. For example, a stateful helper process may stop being RESERVED for reasons other than writing the entire request data to the helper process.

Squid Cache Manager reports individual helper states on helper-specific pages such as mgr:store_io.

🔗 Helper protocols

:information_source: Squid-2.6 and later all support concurrency, but the bundled helpers and many third-party commercial helpers do not. This is changing, and the use of concurrency is encouraged to improve performance. The relevant squid.conf concurrency setting must match the helper's concurrency support. The helper multiplexer wrapper can be used to add concurrency benefits to most non-concurrent helpers.

🔗 Key=Value pairs (kv-pairs) format

The interface for all helpers has been extended to support arbitrary lists of key=value pairs, with the syntax `key=value`. Some keys have special meaning to Squid, as documented here. All messages from Squid are URL-escaped (the `rfc1738_unescape` function from rfc1738.h can be used to decode them). For responses, the safe way is to either URL-escape the value or enclose it in double quotes ("). Any double quotes or backslashes (\) inside a quoted value need to be prefixed by a backslash, and the sequences \r and \n are replaced by CR and LF respectively.

Some example key values:

                user=John%20Smith
                user="John Smith"
                user="J. \"Bob\" Smith"
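The two response encodings can be sketched in Python as follows. The function names `url_escape`, `quote_value` and `kv_pair` are hypothetical, and the standard library's `urllib.parse.quote` is used here as a stand-in for the RFC 1738 escaping that Squid itself performs.

```python
from urllib.parse import quote, unquote


def url_escape(value):
    """URL-escape a value, e.g. a space becomes %20."""
    return quote(value, safe="")


def quote_value(value):
    """Enclose a value in double quotes, backslash-escaping as needed."""
    escaped = (value.replace("\\", "\\\\")   # backslashes first
                    .replace('"', '\\"')     # then embedded double quotes
                    .replace("\r", "\\r")    # CR written as the sequence \r
                    .replace("\n", "\\n"))   # LF written as the sequence \n
    return f'"{escaped}"'


def kv_pair(key, value, quoted=True):
    """Build one key=value token for a helper response line."""
    encoded = quote_value(value) if quoted else url_escape(value)
    return f"{key}={encoded}"


# The three example values shown above:
plain = kv_pair("user", "John Smith", quoted=False)
quoted = kv_pair("user", "John Smith")
nested = kv_pair("user", 'J. "Bob" Smith')

# Decoding a URL-escaped value received from Squid:
decoded = unquote("John%20Smith")
```

Backslashes are escaped before quotes so that the backslash added in front of a quote is not itself doubled.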

🔗 URL manipulation

Input line received from Squid:

[channel-ID] URL [key-extras]
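As a rough sketch, the input line above could be split into its parts like this. `parse_url_request` is a hypothetical name, and the `concurrent` flag stands for whether concurrency is enabled for the helper in squid.conf, since the channel-ID is only present in that case. The extra fields are returned as raw tokens because their meaning depends on the Squid version and configuration.

```python
def parse_url_request(line, concurrent=False):
    """Split '[channel-ID] URL [key-extras]' into (channel_id, url, extras)."""
    fields = line.rstrip("\r\n").split(" ")
    channel_id = None
    if concurrent:
        # With concurrency enabled the first token identifies the request.
        channel_id = fields.pop(0)
    url = fields.pop(0) if fields else ""
    extras = fields  # whatever remains, left undecoded
    return channel_id, url, extras


with_channel = parse_url_request(
    "0 http://example.com/ 192.0.2.1/- - GET\n", concurrent=True)
without_channel = parse_url_request("http://example.com/ - GET\n")
```

A concurrent helper must put the same channel-ID at the start of its reply so that Squid can match the reply to the request.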

🔗 HTTP Redirection

Redirection can be performed by helpers on the url_rewrite_program interface. Lines performing either redirect or re-write can be produced by the same helpers on a per-request basis. Redirect is preferred since re-writing URLs introduces a large number of problems into the client HTTP experience.

The input line received from Squid is detailed by the section above.

Redirectors send a slightly different format of line back to Squid.

Result line sent back to Squid:

[channel-ID] [result] [kv-pairs] [status:URL]
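A redirect reply in the format above might be assembled as in this sketch. It assumes the `status` and `url` keys described in the squid.conf documentation for url_rewrite_program in newer releases, and the bare `status:URL` form for older ones; check both against the documentation for your Squid version. `redirect_reply` and its `legacy` flag are hypothetical names.

```python
def redirect_reply(url, status=302, channel_id=None, legacy=False):
    """Build a redirect result line for the url_rewrite_program interface."""
    if legacy:
        # Older form: the status code and URL joined by a colon.
        body = f"{status}:{url}"
    else:
        # Assumed newer form: result code followed by kv-pairs.
        body = f'OK status={status} url="{url}"'
    # Echo the channel-ID back only when Squid sent one.
    return body if channel_id is None else f"{channel_id} {body}"


modern = redirect_reply("http://example.com/blocked", channel_id="3")
old_style = redirect_reply("http://example.com/blocked", status=301, legacy=True)
```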

🔗 URL Re-Writing (Mangling)

URL re-writing can be performed by helpers on the url_rewrite_program, storeurl_rewrite_program and location_rewrite_program interfaces.

WARNING: when used on the url_rewrite_program interface re-writing URLs introduces a large number of problems into the client HTTP experience. Some of these problems can be mitigated with a paired helper running on the location_rewrite_program interface de-mangling the server redirection URLs.

Result line sent back to Squid:

[channel-ID] [result] [kv-pair] [URL]

🔗 Store ID de-duplication

URL to Store-ID mapping can be performed by helpers on the storeid_rewrite_program interface.

WARNING: care must be taken that the URLs de-duplicated onto one shared ID are actually duplicates. Clients needing to revalidate will cause the cached object to be sourced from either of the duplicate locations. If they are not real duplicates this can randomly cause major issues with the client experience.

Result line sent back to Squid:

[channel-ID] result kv-pair

🔗 Authenticator

🔗 Basic Scheme

Input line received from Squid:

[channel-ID] username password [key-extras]

Result line sent back to Squid:

[channel-ID] result [kv-pair]
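A minimal sketch of a Basic scheme check follows, assuming concurrency is disabled (no channel-ID) and that `OK` and `ERR` are the result codes for a valid and an invalid login. The `USERS` table and the `check_basic` function are hypothetical; a real helper would consult a password database rather than a dictionary held in memory.

```python
import hmac
from urllib.parse import unquote

# Hypothetical credential store, for illustration only.
USERS = {"alice": "s3cret pass"}


def check_basic(line, users=USERS):
    """Return 'OK' if 'username password' matches the store, else 'ERR'."""
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) < 2:
        return "ERR"
    # Squid URL-escapes both fields, so decode them before comparing.
    username, password = unquote(parts[0]), unquote(parts[1])
    expected = users.get(username)
    if expected is None:
        return "ERR"
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(expected.encode(), password.encode()):
        return "OK"
    return "ERR"


good = check_basic("alice s3cret%20pass\n")
bad = check_basic("alice wrong\n")
```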

🔗 Bearer Scheme

Input line received from Squid:

channel-ID b64token [key-extras]

Result line sent back to Squid:

channel-ID result [kv-pair]

🔗 Digest Scheme

Input line received from Squid:

[channel-ID] "username":"realm" [key-extras]

:information_source: The username and realm strings are both double-quoted (") and separated by a colon (:) as shown above.

Result line sent back to Squid:

[channel-ID] [result] [kv-pair] [hash]
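The sketch below parses the quoted username and realm and computes the HTTP Digest H(A1) value, which is the MD5 of `username:realm:password`. The `ha1` reply key is an assumption based on the squid.conf documentation for newer releases, and `PASSWORDS`, `ha1` and `digest_reply` are hypothetical names. Concurrency is assumed to be disabled.

```python
import hashlib
import re

# Matches the leading "username":"realm" part of the input line.
REQUEST_RE = re.compile(r'^"(?P<user>[^"]*)":"(?P<realm>[^"]*)"')

# Hypothetical plain-text store keyed by (username, realm).
PASSWORDS = {("Mufasa", "testrealm@host.com"): "Circle Of Life"}


def ha1(username, realm, password):
    """Compute H(A1) = MD5(username:realm:password) as lowercase hex."""
    data = f"{username}:{realm}:{password}".encode("utf-8")
    return hashlib.md5(data).hexdigest()


def digest_reply(line, passwords=PASSWORDS):
    """Return a result line carrying the hash, or 'ERR' if unknown."""
    match = REQUEST_RE.match(line)
    if match is None:
        return "ERR"
    key = (match.group("user"), match.group("realm"))
    password = passwords.get(key)
    if password is None:
        return "ERR"
    # Assumed kv-pair form of the reply; older releases expect the bare hash.
    return f'OK ha1="{ha1(key[0], key[1], password)}"'


reply = digest_reply('"Mufasa":"testrealm@host.com"\n')
```

Storing precomputed H(A1) values instead of plain-text passwords would avoid keeping the passwords themselves on the proxy.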

🔗 Negotiate and NTLM Scheme

Input line received from Squid:

 request [credentials] [key-extras]

Result line sent back to Squid:

 result [token label] [kv-pair] [message]

🔗 Access Control (ACL)

This interface has a very flexible field layout. The administrator may configure any number or order of details from the relevant HTTP request or reply to be sent to the helper.

Input line received from Squid:

[channel-ID] format-options [acl-value [acl-value ...]]

Result line sent back to Squid:

[channel-ID] result [kv-pair]
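Because the field layout is chosen by the administrator, any example has to assume one. The sketch below assumes concurrency is enabled and that the helper is configured to receive a single field, the client source IP address. `ALLOWED_NET` and `acl_reply` are hypothetical, and `OK`/`ERR` are assumed to mean match and no match.

```python
import ipaddress

# Hypothetical network whose clients should match the ACL.
ALLOWED_NET = ipaddress.ip_network("192.0.2.0/24")


def acl_reply(line, allowed=ALLOWED_NET):
    """Answer 'channel-ID OK' when the source address is in the network."""
    fields = line.split()
    if len(fields) < 2:
        # Malformed request: echo any channel-ID we did get and report no match.
        return " ".join(fields[:1] + ["ERR"])
    channel_id, source = fields[0], fields[1]
    try:
        matched = ipaddress.ip_address(source) in allowed
    except ValueError:
        matched = False  # not a parseable IP address
    return f"{channel_id} {'OK' if matched else 'ERR'}"


inside = acl_reply("7 192.0.2.55\n")
outside = acl_reply("8 198.51.100.9\n")
```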

🔗 Logging

Squid sends a number of commands to the log daemon. These are sent in the first byte of each input line:

No response is expected. Any response that may be desired should occur on stderr to be viewed through cache.log.

🔗 SSL certificate generation

This interface has a fixed field layout.

Input line received from Squid:

request size kv-pairs [body]

:warning: Here "line" refers to a logical input. The body may contain \n characters, so each line in this format is delimited by a 0x01 byte instead of the standard \n byte.

Result line sent back to Squid:

result size [kv-pairs] body
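Since these messages end with a 0x01 byte rather than a newline, a helper cannot read them line by line. The sketch below shows one way to split a byte buffer into complete messages and pull the first two fields off each one. `split_messages` and `parse_header` are hypothetical names, no channel-ID is assumed, and the size field is only parsed, not checked against the payload.

```python
def split_messages(buffer):
    """Split bytes on the 0x01 delimiter.

    Returns (complete_messages, remainder), where remainder is the
    trailing partial message still waiting for its delimiter.
    """
    *complete, remainder = buffer.split(b"\x01")
    return complete, remainder


def parse_header(message):
    """Split 'request size rest' into (request, size, rest).

    Raises ValueError if the message has fewer than three fields or the
    size is not an integer.
    """
    request, size, rest = message.split(b" ", 2)
    return request.decode("ascii"), int(size), rest


# Two complete messages followed by the start of a third.
stream = b"cert_validate 16 host=example.com\x01a 2 x\ny\x01par"
messages, pending = split_messages(stream)
first = parse_header(messages[0])
```

The remainder would normally be kept and prepended to the next chunk read from stdin.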

🔗 SSL server certificate validator

This interface is similar to the SSL certificate generation interface.

Input line received from Squid:

request size [kv-pairs]

:warning: Here "line" refers to a logical input. The kv-pair values may contain \n characters, so each line in this format is delimited by a 0x01 byte instead of the standard \n byte.

Example request:

0 cert_validate 1519 host=dmz.example-domain.com
cert_0=-----BEGIN CERTIFICATE-----
MIID+DCCA2GgAwIBAgIJAIDcHRUxB2O4MA0GCSqGSIb3DQEBBAUAMIGvMQswCQYD
...
YpVJGt5CJuNfCcB/
-----END CERTIFICATE-----
error_name_0=X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT
error_cert_0=cert0

Result line sent back to Squid:

result size kv-pairs

Example response message:

ERR 1444 cert_10=-----BEGIN CERTIFICATE-----
MIIDojCCAoqgAwIBAgIQE4Y1TR0/BvLB+WUF1ZAcYjANBgkqhkiG9w0BAQUFADBr
...
398znM/jra6O1I7mT1GvFpLgXPYHDw==
-----END CERTIFICATE-----
error_name_0=X509_V_ERR_DEPTH_ZERO_SELF_SIGNED_CERT
error_reason_0=Checked by Cert Validator
error_cert_0=cert_10

🔗 Cache file eraser

The unlink() function used to erase files is a blocking call and can slow Squid down. This interface is used to pass file erase instructions to a helper program specified by unlinkd_program.

This interface has a fixed field layout. As of Squid-3.3 this interface does not support concurrency. It requires Squid to be built with --enable-unlinkd, and only cache storage types which use disk files (UFS, AUFS, diskd) use this interface.

Input line received from Squid:

path

Result line sent back to Squid:

result [kv-pair]
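The core of such a helper is a single blocking call per path. This sketch assumes `OK` and `ERR` as the result codes; `unlink_reply` is a hypothetical name, and the temporary file exists only to make the example self-contained.

```python
import os
import tempfile


def unlink_reply(path):
    """Erase one file and report the outcome as a result code."""
    try:
        os.unlink(path)
    except OSError:
        # Missing file, permission problem, and so on.
        return "ERR"
    return "OK"


# Demonstration on a throwaway file.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
first = unlink_reply(demo_path)    # the file exists and is removed
second = unlink_reply(demo_path)   # the file is already gone
```

Running the blocking `unlink()` in a separate process is the whole point of the interface: Squid hands off the path and carries on with other work.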
