
[Fwd: Re: Squid log, how to analyze it]




~    6.6 /access.log/

Most log file analysis programs are based on the entries in
/access.log/. Currently, there are two file formats possible for the
log file, depending on your configuration for the /emulate_httpd_log/
option. By default, Squid will log in its native log file format. If
the above option is enabled, Squid will log in the common log file
format as defined by the CERN web daemon.

The common log file format contains some information that the native
format lacks, but less information overall. The native format contains
more information for the admin interested in cache evaluation.


~      /The common log file format/

The Common Logfile Format
<http://www.w3.org/Daemon/User/Config/Logging.html#common-logfile-format>
is used by numerous HTTP servers. This format consists of the
following seven fields:

~        remotehost rfc931 authuser [date] "method URL" status bytes

A variety of tools can parse it. Its content differs from that of the
native log file format; for example, it logs the HTTP version, which
the native format does not.
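
The seven fields listed above can be pulled apart with a regular
expression. The following is a minimal Python sketch, not something
shipped with Squid: the names CLF_PATTERN and parse_common_line are
hypothetical, the sample line is invented for illustration, and the
pattern assumes that the quoted request contains no double quotes.

```python
import re
from typing import Optional, Tuple

# One group per field: remotehost rfc931 authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-)$'
)

def parse_common_line(line: str) -> Optional[Tuple[str, ...]]:
    """Return the seven fields of a common-format line, or None on mismatch."""
    match = CLF_PATTERN.match(line.strip())
    return match.groups() if match else None

# Invented sample line, not taken from a real log.
sample = ('192.168.1.1 - usuario [23/Dec/2003:10:44:40 +0000] '
          '"GET http://www.site.com.br/index.php HTTP/1.0" 304 269')
print(parse_common_line(sample))
```

Lines that do not match return None rather than raising, so a caller
can count and skip malformed entries.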


~      /The native log file format/

The native format is different for different major versions of Squid.
For Squid-1.0 it is:

~        time elapsed remotehost code/status/peerstatus bytes method URL

For Squid-1.1, the information from the /hierarchy.log/ was moved into
/access.log/. The format is:

~        time elapsed remotehost code/status bytes method URL rfc931 peerstatus/peerhost type

For Squid-2 the columns stay the same, though the content within may
change a little.
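
As a rough illustration of the Squid-1.1/Squid-2 column layout, a log
line can be split on whitespace into the ten named columns. This is a
minimal Python sketch under stated assumptions: the function name
parse_native_line and the dictionary keys are labels chosen for this
example, not identifiers defined by Squid, and the URL is assumed to
contain no whitespace.

```python
from typing import Dict

# Column labels follow the Squid-1.1 layout quoted above; the names are
# chosen for this sketch only.
FIELDS = ["time", "elapsed", "remotehost", "code_status", "bytes",
          "method", "url", "rfc931", "peerstatus_peerhost", "type"]

def parse_native_line(line: str) -> Dict[str, str]:
    """Split one native access.log line into its ten columns."""
    parts = line.split()
    if len(parts) < len(FIELDS):
        raise ValueError("expected at least 10 columns, got %d" % len(parts))
    # Anything beyond the tenth column is ignored in this sketch.
    return dict(zip(FIELDS, parts[:len(FIELDS)]))

sample = ("1072176280.356     29 192.168.1.1 TCP_IMS_HIT/304 269 GET "
          "http://www.site.com.br/index.php usuario NONE/- text/plain")
entry = parse_native_line(sample)
print(entry["code_status"], entry["bytes"])
```

Every value is kept as a string; converting the numeric columns is
left to the caller, since the useful types depend on the analysis.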

The native log file format logs more and different information than
the common log file format: the request duration, some timeout
information, the next upstream server address, and the content type.

Tools exist that convert one file format into the other. Please mind
that although the two formats share most of their information, each
contains information the other lacks, and that information is lost in
conversion. In particular, converting back and forth is not possible
without loss.

/squid2common.pl/ is a conversion utility that converts any of the
Squid log file formats into the old CERN proxy style output. Tools
exist to analyse, evaluate and graph results from that format.


~      /access.log native format in detail/

Using Squid's native log format is nevertheless recommended, because
it makes more information available for later analysis. The print
format line for native /access.log/ entries looks like this:

~    "%9d.%03d %6d %s %s/%03d %d %s %s %s %s%s/%s %s"

Therefore, an /access.log/ entry usually consists of (at least) 10
columns separated by one or more spaces:

*time*

~    A Unix timestamp as UTC seconds with a millisecond resolution. You
~    can convert Unix timestamps into something more human readable
~    using this short perl script:

~        #! /usr/bin/perl -p
~        s/^\d+\.\d+/localtime $&/e;
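
The same conversion can be sketched in Python. Unlike the Perl script,
which prints local time, this version formats the timestamp in UTC so
that the result does not depend on the machine's time zone. The
function name format_timestamp is a hypothetical helper for this
example.

```python
from datetime import datetime, timezone

def format_timestamp(stamp: str) -> str:
    """Convert a Squid timestamp such as '1072176280.356' to ISO 8601 UTC."""
    # Split on the dot to keep the milliseconds exactly as logged,
    # avoiding floating-point rounding.
    seconds, _, millis = stamp.partition(".")
    moment = datetime.fromtimestamp(int(seconds), tz=timezone.utc)
    return "%s.%sZ" % (moment.strftime("%Y-%m-%dT%H:%M:%S"), millis or "000")

print(format_timestamp("1072176280.356"))
```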

*duration*

~    The elapsed time is the number of milliseconds the transaction
~    occupied the cache. Its interpretation differs between TCP and UDP:

~        * For HTTP/1.0, this is basically the time between /accept()/
~          and /close()/.
~        * For persistent connections, this ought to be the time
~          between scheduling the reply and finishing sending it.
~        * For ICP, this is the time between scheduling a reply and
~          actually sending it.

~    Please note that the entries are logged /after/ the reply finished
~    being sent, /not/ during the lifetime of the transaction.

*client address*

~    The IP address of the requesting instance, the client IP address.
~    The /client_netmask/ configuration option can mask client addresses
~    for data protection reasons, but it makes analysis more difficult.
~    Often it is better to use one of the log file anonymizers.

~    Also, the /log_fqdn/ configuration option may log the fully
~    qualified domain name of the client instead of the dotted quad.
~    The use of that option is discouraged due to its performance impact.
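
To illustrate what masking a client address does to the logged value,
the sketch below applies a netmask to a dotted quad using Python's
ipaddress module. It only mimics the arithmetic; how Squid applies
/client_netmask/ internally is not shown here, and the helper name
mask_client and the example address are assumptions made for this
example.

```python
import ipaddress

def mask_client(address: str, netmask: str) -> str:
    """Zero the host bits of an IPv4 address according to a netmask."""
    addr = int(ipaddress.IPv4Address(address))
    mask = int(ipaddress.IPv4Address(netmask))
    return str(ipaddress.IPv4Address(addr & mask))

# With a 255.255.255.0 mask, all clients of one /24 network share one entry.
print(mask_client("192.168.1.77", "255.255.255.0"))
```

This shows why masked logs are harder to analyse: distinct clients on
the same network become indistinguishable.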

*result codes*

~    This column is made up of two entries separated by a slash. This
~    column encodes the transaction result:

~       1. The cache result of the request contains information on the
~          kind of request, how it was satisfied, or in what way it
~          failed. Please refer to section Squid result codes
~          <#cache-result-codes> for valid symbolic result codes.

~          Several codes from older versions are no longer available,
~          were renamed, or split. Especially the /ERR_/ codes do not
~          seem to appear in the log file any more. Also refer to
~          section Squid result codes <#cache-result-codes> for details
~          on the codes no longer available in Squid-2.

~          The NOVM versions and Squid-2 also rely on the Unix buffer
~          cache, thus you will see fewer /TCP_MEM_HIT/s than with
~          Squid-1. Basically, the NOVM feature relies on /read()/ to
~          obtain an object, but due to the kernel buffer cache, no
~          disk activity is needed. Only small objects (below 8KByte)
~          are kept in Squid's part of main memory.

~       2. The status part contains the HTTP result codes with some
~          Squid specific extensions. Squid uses a subset of the RFC
~          defined error codes for HTTP. Refer to section status codes
~          <#http-status-codes> for details of the status codes
~          recognized by a Squid-2.
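
Splitting the result column at the slash yields the two entries
described above. A minimal Python sketch, with split_result as a
hypothetical helper name:

```python
from typing import Tuple

def split_result(column: str) -> Tuple[str, int]:
    """Split a result column into (cache result code, HTTP status)."""
    code, _, status = column.partition("/")
    return code, int(status)

print(split_result("TCP_IMS_HIT/304"))
```

Returning the status as an integer makes range checks such as
400 <= status < 600 straightforward when counting failed requests.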

*bytes*

~    The size is the amount of data delivered to the client. Mind that
~    this does not constitute the net object size, as headers are also
~    counted. Also, failed requests may deliver an error page, the size
~    of which is also logged here.

*request method*

~    The request method to obtain an object. Please refer to section
~    request-methods <#request-methods> for available methods. If you
~    turned off /log_icp_queries/ in your configuration, you will not
~    see (and thus be unable to analyse) ICP exchanges. The /PURGE/
~    method is only available if you have an ACL for ``method purge''
~    enabled in your configuration file.

*URL*

~    This column contains the URL requested. Please note that the URL
~    in the log file may contain whitespace. The default configuration
~    for /uri_whitespace/ denies whitespace, though.

*rfc931*

~    The eighth column may contain the ident lookups for the requesting
~    client. Since ident lookups have a performance impact, the default
~    configuration turns /ident_lookups/ off. If turned off, or no
~    ident information is available, a ``-'' will be logged.

*hierarchy code*

~    The hierarchy information consists of three items:

~       1. Any hierarchy tag may be prefixed with /TIMEOUT_/ if a
~          timeout occurred while waiting for all ICP replies to
~          return from the neighbours. The timeout is dynamic if
~          /icp_query_timeout/ was not set; otherwise it is the time
~          configured there.
~       2. A code that explains how the request was handled, e.g. by
~          forwarding it to a peer, or going straight to the source.
~          Refer to section hier-codes <#hier-codes> for details on
~          hierarchy codes and removed hierarchy codes.
~       3. The IP address or hostname where the request (if a miss) was
~          forwarded. For requests sent to origin servers, this is the
~          origin server's IP address. For requests sent to a neighbor
~          cache, this is the neighbor's hostname. NOTE: older versions
~          of Squid would put the origin server hostname here.
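
The three items of the hierarchy column can be separated as sketched
below. This is an illustrative Python sketch only: the Hierarchy tuple
and parse_hierarchy are hypothetical names, and the second sample
value is invented to show the /TIMEOUT_/ prefix.

```python
from typing import NamedTuple

class Hierarchy(NamedTuple):
    timed_out: bool  # True if the tag carried the TIMEOUT_ prefix
    code: str        # hierarchy code with the prefix removed
    host: str        # peer or origin address, or "-" if none

def parse_hierarchy(column: str) -> Hierarchy:
    """Split a hierarchy column such as 'NONE/-' into its three items."""
    tag, _, host = column.partition("/")
    timed_out = tag.startswith("TIMEOUT_")
    if timed_out:
        tag = tag[len("TIMEOUT_"):]
    return Hierarchy(timed_out, tag, host)

print(parse_hierarchy("NONE/-"))
# Invented value illustrating the timeout prefix.
print(parse_hierarchy("TIMEOUT_DIRECT/192.0.2.10"))
```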

*type*

~    The content type of the object as seen in the HTTP reply header.
~    Please note that ICP exchanges usually don't have any content
~    type, and thus are logged ``-''. Also, some weird replies have
~    content types ``:'' or even empty ones.

There may be two more columns in the /access.log/ if the (debug)
option /log_mime_headers/ is enabled. In this case, the HTTP request
headers are logged between a ``['' and a ``]'', and the HTTP reply
headers are also logged between ``['' and ``]''. All control
characters like CR and LF are URL-escaped, but spaces are /not/
escaped! Parsers should watch out for this.
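
Because the bracketed header columns contain unescaped spaces, a plain
whitespace split is not enough. One possible approach, sketched here
in Python, is to split off the first ten columns and then match the
two bracketed blocks in the remainder. The function name
split_with_headers is hypothetical, the sample line (including its
header contents and the exact escape spelling) is invented, and the
sketch assumes that the URL contains no whitespace and that the
request headers contain no "] [" sequence.

```python
import re
from typing import List, Optional, Tuple
from urllib.parse import unquote

def split_with_headers(
        line: str) -> Tuple[List[str], Optional[str], Optional[str]]:
    """Return (first ten columns, request headers, reply headers)."""
    # maxsplit=10 keeps everything after the tenth column in one piece.
    parts = line.strip().split(None, 10)
    columns = parts[:10]
    rest = parts[10] if len(parts) > 10 else ""
    match = re.fullmatch(r"\[(.*?)\]\s+\[(.*)\]", rest)
    if not match:
        return columns, None, None
    # Undo the URL-escaping of control characters such as CR and LF.
    return columns, unquote(match.group(1)), unquote(match.group(2))

# Invented sample line for illustration only.
sample = ("1072176280.356 29 192.168.1.1 TCP_MISS/200 269 GET "
          "http://www.site.com.br/index.php - DIRECT/192.0.2.10 text/html "
          "[Host: www.site.com.br%0d%0aUser-Agent: Example Agent%0d%0a] "
          "[HTTP/1.0 200 OK%0d%0aContent-Type: text/html%0d%0a]")
columns, request, reply = split_with_headers(sample)
print(columns[9])
print(request.splitlines())
```

Lines logged without the header columns come back with None for both
header values, so the same function can handle either configuration.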



Srinath M. wrote:

| Wouldn't "SARG" solve your problem?
|
| Hélio Poffo Junior - A Notícia wrote:
|
|> Good morning everyone, I am trying to develop a home-grown tool
|> for analyzing the Squid log, but I am having trouble interpreting
|> some of the 'fields'.
|>
|> Like this:
|>
|> 1072176280.356     29 192.168.1.1 TCP_IMS_HIT/304 269 GET
|> http://www.site.com.br/index.php usuario NONE/- text/plain
|>
|>  (1) 1072176280.356
|>  (2) 29
|>  (3) 192.168.1.1
|>  (4) TCP_IMS_HIT/304
|>  (5) 269
|>  (6) GET
|>  (7) http://www.site.com.br/index.php
|>  (8) usuario
|>  (9) NONE/-
|> (10) text/plain
|>
|> Does anyone know how the time count works? (1)
|> What is that 29? (2)
|> Does the Squid documentation list the TCP types? (4)
|> What is that 269? (5)
|> NONE? (9)
|>
|> Any ideas?
|>
|> Regards,
|>
|>
|> Hélio José Poffo Junior
|> Support Analyst
|>
|> ------------------------------------------------
|> Debian GNU/Linux 3.0 (woody) 2.4.20
|> Linux User #196175
|>
|> A Notícia Empresa Jornalística S/A
|> http://www.an.com.br
|>
|>
|>
|>
|
|



- --
+--------------------------------------------------------------+
|  Fabiano F. Siqueira                fabiano@dnconect.com.br  |
|  DN Conectividade                   Support Analyst          |
+--------------------------------------------------------------+
|  Public GPG KeyID                   C0AD7129                 |
|  Keyserver                          wwwkeys.eu.pgp.net       |
+--------------------------------------------------------------+
Key fingerprint = 34BF 32F2 4E71 90D0 1459  69FE B231 F7C7 C0AD 7129


