File /program/lib/waslib.php

Description

/program/lib/waslib.php - core functions

This file provides various utility routines.

Constants
CAPACITY_CHAIR = 8 (line 38)
CAPACITY_CUSTOM1 = 11 (line 41)
CAPACITY_CUSTOM2 = 12 (line 42)
CAPACITY_CUSTOM3 = 13 (line 43)
CAPACITY_CUSTOM4 = 14 (line 44)
CAPACITY_CUSTOM5 = 15 (line 45)
CAPACITY_CUSTOM6 = 16 (line 46)
CAPACITY_CUSTOM7 = 17 (line 47)
CAPACITY_CUSTOM8 = 18 (line 48)
CAPACITY_CUSTOM9 = 19 (line 49)
CAPACITY_EDITOR = 9 (line 39)
CAPACITY_MEMBER = 4 (line 34)
CAPACITY_NEXT_AVAILABLE = 20 (line 50)
CAPACITY_NONE = 0 (line 30)

The constants CAPACITY_* are used for group memberships (see accountmanagerlib.php).

CAPACITY_PRINCIPAL = 3 (line 33)
CAPACITY_PROJECTLEAD = 5 (line 35)
CAPACITY_PUBLISHER = 10 (line 40)
CAPACITY_PUPIL = 1 (line 31)
CAPACITY_SECRETARY = 7 (line 37)
CAPACITY_TEACHER = 2 (line 32)
CAPACITY_TREASURER = 6 (line 36)
QUASI_RANDOM_DIGITS = 10 (line 253)
QUASI_RANDOM_DIGITS_UPPER = 36 (line 255)
QUASI_RANDOM_DIGITS_UPPER_LOWER = 62 (line 256)
QUASI_RANDOM_HEXDIGITS = 16 (line 254)
Functions
appropriate_legal_notices (line 1524)

construct a link to appropriate legal notices as per AGPLv3 section 5

This routine constructs ready-to-use HTML-code for a link to the Appropriate Legal Notices, which are to be found in /program/about.html. Depending on the $high_visibility flag we generate either a text-based link or a clickable image.

The actual text / image to use depends on the global constant WAS_ORIGINAL. This constant is defined in /program/version.php and it should be TRUE for the original version of Website@School and FALSE for modified versions.

In the former case the anchor looks like 'Powered by Website@School', in the latter case it will look like 'Based on Website@School', which is in line with the requirements from the license agreement for Website@School, see /program/license.html.

IMPORTANT NOTE

Please respect the license agreement and change the definition of WAS_ORIGINAL to FALSE if you modify this program (see /program/version.php). You also should change the file '/program/about.html' and add a 'prominent notice' of your modifications.

Note: a comparable routine can be found in install.php.

  • return: ready-to-use HTML
string appropriate_legal_notices (bool $high_visibility, [string $m = ''])
  • bool $high_visibility: if TRUE we return a text-only link, otherwise a clickable image
  • string $m: margin to improve readability of generated code
build_tree (line 924)

construct a tree of nodes in memory

this reads the nodes in the specified area from disk and constructs a tree via linked lists (sort of). If parameter $force is TRUE, the data is read from the database, otherwise a cached version is returned (if available).

Note that this routine also 'repairs' the tree when an orphan is detected. The orphan is automagically moved to the top of the area. Of course, it shouldn't happen, but if it does we are better off with a magically _appearing_ orphan than a _disappearing_ node.

A lot of operations in the page manager work with a tree of nodes in some way, e.g. walking the tree and displaying it or walking the tree and collecting the sections (but not the pages), etc.

The tree starts with a 'root' with key 0 ($tree[0]). This is the starting point of the tree. The nodes at the top level of an area are linked from this root node via the field 'first_child_id'. If there are no nodes in the area, this field 'first_child_id' is zero. The linked list is constructed by using the node_id. All nodes in an area are collected in an array. This array is used to construct the linked lists.

Every node has a parent (via 'parent_id'), where the nodes at the top level have a parent_id of zero; this points to the 'root'. The nodes within a section or at the top level are linked forward via 'next_sibling_id' and backward via 'prev_sibling_id'. A zero indicates the end of the list. Children start with 'first_child_id'. A value of zero means: no children.

The complete node record from the database is also stored in the tree. This is used extensively throughout the pagemanager; it acts as a cache for all nodes in an area.

Note that we cache the node records per area. If two areas are involved, the cache doesn't work very well anymore. However, this doesn't happen very often; only in case of moving nodes from one area to another (and even then).

  • return: contains a 'root node' 0 plus all nodes from the requested area if any
  • todo: what if we need the trees of two different areas? should the static var here be an array, keyed by area_id?
  • todo: repairing a node doesn't really belong here, in this routine. we really should have a separate 'database repair tool' for this purpose. someday we'll fix this....
array build_tree (int $area_id, [bool $force = FALSE])
  • int $area_id: the area to make the tree for
  • bool $force: if TRUE forces reread from database (resets the cache)
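The linked-list layout described above can be illustrated with a minimal sketch. This is not the code of build_tree() itself: the function name build_tree_sketch(), the 'record' key and the input format (node_id => record, already in sibling order) are assumptions made for the example, there is no database access and no caching, and an orphan is simply linked at the end of the top level.

```php
<?php
// Sketch only: build the linked-list tree from flat node records.
// Assumption: $records maps node_id => ['parent_id' => int], in sibling order.
function build_tree_sketch(array $records): array
{
    $blank = ['parent_id' => 0, 'first_child_id' => 0,
              'prev_sibling_id' => 0, 'next_sibling_id' => 0];
    $tree = [0 => ['node_id' => 0] + $blank];       // the 'root' with key 0
    foreach ($records as $node_id => $record) {
        $tree[$node_id] = ['node_id' => $node_id] + $blank;
        $tree[$node_id]['parent_id'] = $record['parent_id'];
        $tree[$node_id]['record'] = $record;         // full record doubles as a cache
    }
    $last_child = [];                                // parent_id => last child linked so far
    foreach ($records as $node_id => $record) {
        $parent_id = $tree[$node_id]['parent_id'];
        if (!isset($tree[$parent_id]) || $parent_id == $node_id) {
            $parent_id = 0;                          // orphan: attach to the top level
            $tree[$node_id]['parent_id'] = 0;
        }
        if (isset($last_child[$parent_id])) {
            $prev = $last_child[$parent_id];         // append after the previous sibling
            $tree[$prev]['next_sibling_id'] = $node_id;
            $tree[$node_id]['prev_sibling_id'] = $prev;
        } else {
            $tree[$parent_id]['first_child_id'] = $node_id;
        }
        $last_child[$parent_id] = $node_id;
    }
    return $tree;
}

$tree = build_tree_sketch([
    1 => ['parent_id' => 0],
    2 => ['parent_id' => 1],
    3 => ['parent_id' => 1],
    4 => ['parent_id' => 99],    // orphan: parent 99 does not exist
]);
```

In this example node 4 ends up as a sibling of node 1 at the top level, which mirrors the 'appearing orphan' behaviour described above.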
calculate_uri_shortcuts (line 189)

try to eliminate the scheme and authority from the two main uri's

This tries to get rid of the scheme and the authority in 'www' and 'progwww'. If these two elements are the same, it becomes possible to use a shorter form of the uri when referencing files in 'progwww' from 'www'.

If the scheme and the authority of 'www' and 'progwww' are the same, the returned strings contain only the path elements. If scheme and authority differ, they contain the same as 'www' and 'progwww' respectively.

Examples: www = 'http://www.example.com/site' and progwww = 'http://www.example.com/site/program' yields www_short = '' and progwww_short = '/program'.

www = 'http://www.example.com' and progwww = 'http://common.example.com/program' yields www_short identical to www and progwww_short identical to progwww.

The purpose is to be able to generate relative links, e.g. an image in /program/graphics/foo.jpg can be referred to like this

    <img src="{$CFG->progwww_short}/graphics/foo.jpg"> or <img src="/program/graphics/foo.jpg"> rather than <img src="http://www.example.com/program/graphics/foo.jpg">

Note that the comparison in this routine is not very fancy; it can easily be fooled into considering scheme+authority to be different. However, since this routine is only used to compare two values from config.php, it's not likely to cause trouble.

  • return: the two short versions of www and progwww, if possible
array calculate_uri_shortcuts (string $www, string $progwww)
  • string $www: the uri (scheme / authority / path) of the directory holding config.php
  • string $progwww: the uri (scheme / authority / path) corresponding with the program directory
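A minimal sketch of the idea, assuming PHP's parse_url() is used to split the uri's. The name uri_shortcuts_sketch() and the exact comparison (scheme, host and port, case-insensitive) are assumptions for illustration; the real routine may compare differently.

```php
<?php
// Sketch only: return [www_short, progwww_short].
function uri_shortcuts_sketch(string $www, string $progwww): array
{
    $a = parse_url($www);
    $b = parse_url($progwww);
    // Reduce a parsed uri to 'scheme://host[:port]' or NULL if that is not possible.
    $key = function ($parts) {
        if (!is_array($parts) || !isset($parts['scheme'], $parts['host'])) {
            return null;
        }
        $port = isset($parts['port']) ? ':' . $parts['port'] : '';
        return strtolower($parts['scheme'] . '://' . $parts['host'] . $port);
    };
    $key_www = $key($a);
    $key_progwww = $key($b);
    if ($key_www === null || $key_www !== $key_progwww) {
        return [$www, $progwww];                 // different: no shortcut possible
    }
    return [$a['path'] ?? '', $b['path'] ?? '']; // same: keep only the path elements
}

list($www_short, $progwww_short) =
    uri_shortcuts_sketch('http://www.example.com', 'http://www.example.com/program');
```

With these inputs $www_short is the empty string and $progwww_short is '/program', so a link such as "{$progwww_short}/graphics/foo.jpg" stays relative to the server.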
calc_user_related_acls (line 1162)

calculate an array with acls related to user $user_id via group memberships

this calculates the related acls for user $user_id. The results are returned as an array keyed by acl_id. It can contain 0 or more elements. The values of the array elements are groupname/capacity-pairs. This routine is referenced from both useraccount.class.php and usermanager.class.php.

  • return: 0, 1 or more acl_id => groupname/capacity pairs
array calc_user_related_acls (int $user_id)
  • int $user_id: the user we're looking at
capacity_name (line 1139)

translate a numeric capacity code to a readable name

this translates a capacity code into a readable name, e.g. as an item in a dropdown list when dealing with group memberships. The actual codes are defined as constants, e.g. CAPACITY_NONE.

  • return: readable name of capacity
string capacity_name (int $capacity)
  • int $capacity: numeric code of capacity
convert_to_type (line 1222)

convert a string to another type (bool, int, etc.)

  • return: the value $value cast to the proper type
  • todo: perhaps change the possible values of $type to full strings rather than 'cryptic' single letter codes. Furthermore: what do we do with invalid dates, times and date/times? For now it is a stub, returning $value as-is. Oh well.
mixed convert_to_type (string $type, string $value)
  • string $type: new type for $value: b=bool, i=integer, s=string, etc.
  • string $value: the value to convert to type $type
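The conversion can be pictured with a minimal sketch using the single-letter codes listed under get_properties(). The name convert_sketch() is hypothetical, and the rule used for booleans (any non-zero integer value is TRUE) is an assumption; dates and times are returned as-is, in line with the todo above.

```php
<?php
// Sketch only: convert a string to the type indicated by a single-letter code.
function convert_sketch(string $type, string $value)
{
    switch ($type) {
        case 'b':
            return intval($value) !== 0;  // assumption: '0' and '' are FALSE, '1' is TRUE
        case 'i':
            return intval($value);
        case 'f':
            return floatval($value);
        case 'd':                         // date, handled like a string
        case 'dt':                        // datetime, handled like a string
        case 't':                         // time, handled like a string
        case 's':
        default:
            return $value;
    }
}
```

For instance, convert_sketch('i', '42') gives the integer 42, while convert_sketch('dt', '2011-01-01 12:00:00') returns the string unchanged.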
cron_send_queued_alerts (line 810)

send pending messages/alerts

this goes through all the alert accounts to see if any messages need to be sent out by email. The strategy is as follows. First we collect a maximum of $max_messages alerts in core (1 trip to the database). Then we iterate through that collection and for every alert we

  1. construct and send an email message
  2. update the record (reset the message buffer and message count) (+1 trip to the database)
Locking and unlocking would be even more expensive, especially when chances of race conditions are not so big. (An earlier version of this routine went to the database once for the list of all pending alerts and subsequently twice for each alert, but eventually I considered that too expensive as well.)

Assuming that an UPDATE is more or less atomic, we hopefully can get away with an UPDATE with a where clause looking explicitly for the previous value of the message count. If a message was added after retrieving the alerts but before updating, the message count would be incremented (by the other process) which would prevent us from updating. The alert would be left unchanged but including the added message. Worst case: the receiver gets the same list of alerts again and again. I consider that a fair trade off, given the low probability of it happening. (Mmmm, famous last words...)

Bottom line, we don't do locking in this routine.

Note that we add a small reminder to the message buffer about us processing the alert and sending a message. However, we don't set the number of messages to 1 because otherwise that would be the signal to send this message the next time. We don't want to send a message every $cron_interval minutes basically saying that we didn't do anything since the previous run. (Or is this a feature after all?)

Failures are logged; successes are logged with priority LOG_DEBUG.

  • return: the number of messages that were processed
int cron_send_queued_alerts ([int $max_messages = 10])
  • int $max_messages: do not send more than this number of messages
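The lock-free update described above (an UPDATE that only succeeds if the message count still has its previous value) can be simulated in memory. This is a sketch under assumptions: the array $alerts stands in for the alerts table, and the field names 'messages' and 'message_buffer' as well as the function name are hypothetical.

```php
<?php
// In-memory stand-in for the alerts table (hypothetical field names).
$alerts = [7 => ['messages' => 2, 'message_buffer' => "msg A\nmsg B\n"]];

// Mimics: UPDATE ... SET messages = 0, message_buffer = <reminder>
//         WHERE alert_id = <id> AND messages = <expected>
// Returns the number of 'affected rows' (0 or 1).
function reset_alert_if_unchanged(array &$table, int $id, int $expected, string $reminder): int
{
    if (!isset($table[$id]) || $table[$id]['messages'] !== $expected) {
        return 0;   // count changed in the meantime: leave the record alone
    }
    $table[$id]['messages'] = 0;
    $table[$id]['message_buffer'] = $reminder;
    return 1;
}

$snapshot = $alerts[7];                  // the alert as retrieved earlier
// ... the email would be constructed and sent here ...
$alerts[7]['messages']++;                // simulate another process queueing a message
$alerts[7]['message_buffer'] .= "msg C\n";

$first_try = reset_alert_if_unchanged($alerts, 7, $snapshot['messages'], "(alert processed)\n");
$second_try = reset_alert_if_unchanged($alerts, 7, $alerts[7]['messages'], "(alert processed)\n");
```

The first attempt affects no rows because the count went from 2 to 3, so the buffer (now including the added message) survives until the next run; the second attempt, using the current count, succeeds.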
get_area_records (line 1117)

retrieve a list of all available area records keyed by area_id

this returns a list of area-records or FALSE if no areas are available. The list is cached via a static variable so we don't have to go to the database more than once for this. Note that the returned array is keyed with area_id and is sorted by sort_order. Also note that this list may include areas for which the current user has no permissions whatsoever.

  • return: FALSE if no areas available or an array with area-records
array|bool get_area_records ([bool $forced = FALSE])
  • bool $forced: if TRUE forces reread from database (resets the cache)
get_parameter_int (line 491)

return an integer value specified in the page request or default value if none

  • return: the value of the parameter or the default value if not specified
mixed get_parameter_int (string $name, [mixed $default_value = NULL])
  • string $name: the name of the parameter to retrieve the value of
  • mixed $default_value: the value to return if parameter was not specified
get_parameter_string (line 506)

return an (unquoted) string value specified in the page request or default value if none

  • return: the value of the parameter or the default value if not specified
mixed get_parameter_string (string $name, [mixed $default_value = NULL])
  • string $name: the name of the parameter to retrieve the value of
  • mixed $default_value: the value to return if parameter was not specified
get_properties (line 116)

retrieve typed properties (name-value-pairs) from a table

this retrieves the fields 'name', 'value' and 'type' from all records from $tablename that satisfy the condition in $where. The values, which are stored as strings in the database, are converted to their proper value type and stored in the resulting array, keyed by name. The following types are recognised:

  • b = boolean
  • d = date ('yyyy-mm-dd', handled like a string)
  • dt = datetime ('yyyy-mm-dd hh:mm:ss', handled like a string)
  • f = float
  • i = integer
  • s = string
  • t = time ('hh:mm:ss', handled like a string)
Note that we currently do not validate these properties, the assumption is that the values are valid (or empty).

  • return: FALSE on error, or an array with name-value-pairs
bool|array get_properties ([string $tablename = 'config'], [array|string $where = ''])
  • string $tablename: the name of the table holding the properties
  • array|string $where: which records do we need to select
get_requested_area (line 442)

get the number of the area the user requested or null if not specified

See discussion of get_requested_node(). We use a separate routine because we may want to support index.php/aaa/nnn/.... instead of index.php?area=aaa&node=nnn&...

  • return: integer indicating the area or null if none specified
int|null get_requested_area ()
get_requested_filename (line 474)

get the name of the requested file

See discussion of get_requested_node(). Files are served via /file.php via a comparable mechanism: either

http://localhost/file.php/path/to/filename.ext

OR

http://localhost/file.php?file=/path/to/filename.ext

This routine extracts the '/path/to/filename.ext' part.

  • return: requested filename or null if none specified
string|null get_requested_filename ()
get_requested_node (line 417)

get the number of the node the user requested or NULL if not specified

This routine exists because nodes and areas are so central to the whole idea of WAS.

The purpose is to retrieve any requested node_id from the parameters submitted by the user. As a rule this works via name-value-pairs, something like this: index.php?area=aaa&node=nnn. However, if the webserver is configured correctly, we can also accept index.php/aaa/nnn/.... or index.php/nnn/.... which is more proxy-friendly. Using a generic routine like get_parameter_int() would not be sufficient in that case.

Note that the same proxy-friendly 'trick' is used to determine the filename of a file that needs to be served via file.php (see get_requested_filename()).

Note that we first look at the proxy-friendly variant. If that doesn't work, we resort to the conventional way of index.php?node=nnn. Also note that the order of the path_info is important. If there is just a single numeric path component, we assume that it is the node value; if there are two numerics our assumption is that the first one is the area id and the second one the node id.

Note that this routine does not validate the requested node in any way other than making sure that IF it is specified, it is an integer value. For all we know it might even be a negative value.

  • return: integer indicating the node or NULL if none specified
int|null get_requested_node ()
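The order of evaluation described above (path components first, query parameters as fallback) can be sketched as follows. The function name and the way the path and query are passed in as arguments are assumptions for the example; the real routines read the request themselves. How non-numeric components are treated is also an assumption: this sketch stops at the first one.

```php
<?php
// Sketch only: return [area_id|null, node_id|null].
function requested_area_and_node_sketch(string $path_info, array $query): array
{
    $numbers = [];
    foreach (explode('/', trim($path_info, '/')) as $component) {
        if (!preg_match('/^-?[0-9]+$/', $component)) {
            break;                          // stop at the first non-numeric component
        }
        $numbers[] = intval($component);
        if (count($numbers) == 2) {
            break;                          // at most area + node
        }
    }
    if (count($numbers) == 2) {
        return [$numbers[0], $numbers[1]];  // index.php/aaa/nnn/...
    }
    if (count($numbers) == 1) {
        return [null, $numbers[0]];         // index.php/nnn/...
    }
    // conventional way: index.php?area=aaa&node=nnn
    $area = isset($query['area']) ? intval($query['area']) : null;
    $node = isset($query['node']) ? intval($query['node']) : null;
    return [$area, $node];
}
```

As in the description, nothing is validated beyond 'it is an integer': a path such as '/-5/' yields node -5.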
get_unique_number (line 1485)

a small utility routine that returns a unique integer

this generates a unique number (starting at 1). This number is guaranteed to be unique during this http-request (or at least until the static variable $id overflows, but that takes a while). If the optional parameter $increment is FALSE, the latest id returned is returned again.

  • return: a new unique value every time
int get_unique_number ([bool $increment = TRUE])
  • bool $increment: optional indicates whether the static counter must be incremented
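A static counter of this kind might look like the following sketch; the name unique_number_sketch() is hypothetical and the real routine may differ in detail.

```php
<?php
// Sketch only: per-request counter kept in a static variable.
function unique_number_sketch(bool $increment = true): int
{
    static $id = 0;     // survives between calls, but only within this request
    if ($increment) {
        ++$id;
    }
    return $id;         // with $increment FALSE: the latest value again
}
```

Typical use would be generating distinct id-attributes for HTML elements within one page.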
get_user_groups (line 1191)

retrieve the records of the groups of which user $user_id is a member

  • return: 0, 1 or more acl_id => $group_record pairs
  • uses: $DB
array get_user_groups (int $user_id)
  • int $user_id: the user we're looking at
ini_get_int (line 1320)

return an integer (bytecount) value from PHP ini

  • return: value expressed in bytes
int ini_get_int (string $variable)
  • string $variable: name of the variable to retrieve, e.g. 'upload_max_filesize'
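PHP ini values such as 'upload_max_filesize' may use shorthand notation ('2M', '512K', '1G'). A minimal sketch of turning such a value into a byte count, assuming only the suffixes K, M and G occur; the name shorthand_to_bytes_sketch() is hypothetical.

```php
<?php
// Sketch only: convert shorthand byte notation to a number of bytes.
function shorthand_to_bytes_sketch(string $value): int
{
    $value = trim($value);
    $number = intval($value);                  // leading digits, e.g. 2 from '2M'
    switch (strtoupper(substr($value, -1))) {  // optional unit suffix
        case 'G':
            $number *= 1024;                   // fall through
        case 'M':
            $number *= 1024;                   // fall through
        case 'K':
            $number *= 1024;
    }
    return $number;
}

$upload_limit = shorthand_to_bytes_sketch((string) ini_get('upload_max_filesize'));
```

The deliberate fall-through multiplies by 1024 once for K, twice for M and three times for G.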
is_expired (line 1079)

determine if any of the ancestors or $node_id itself is already expired

This climbs the tree upward, starting at $node_id, to see if any nodes are expired. If an expired node is detected, TRUE is returned. If none of the nodes are expired, then FALSE is returned.

Note that this routine looks strictly at the expiry property, it is very well possible that a node is under embargo, see is_under_embargo().

Also note that this routine currently also tries to 'fix' the node database when a circular reference is detected. This doesn't really belong here, but for the time being it is convenient to have this auto-repair mechanism here. The node that is fixed is the section we are looking at after MAXIMUM_ITERATIONS tries, which is not necessarily the node we started with.

  • return: TRUE if any ancestor (or node_id) is expired, otherwise FALSE
  • todo: this function also 'repairs' circular references. This should move to a separate tree-repair function but for the time being it is "convenient" to have automatic repairs...
  • uses: $DB
bool is_expired (int $node_id, array &$tree)
  • int $node_id
  • array &$tree: family tree
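Climbing the tree with a cap on the number of steps can be sketched like this. It is an illustration under assumptions: the field name 'expiry', the constant MAX_ITERATIONS_SKETCH (a stand-in for MAXIMUM_ITERATIONS) and the explicit $now argument are hypothetical, and the sketch merely gives up on a circular reference where the real routine attempts a repair.

```php
<?php
const MAX_ITERATIONS_SKETCH = 50;   // hypothetical stand-in for MAXIMUM_ITERATIONS

// Sketch only: TRUE if $node_id or any of its ancestors is expired at time $now.
function is_expired_sketch(int $node_id, array $tree, string $now): bool
{
    for ($i = 0; $i < MAX_ITERATIONS_SKETCH && $node_id != 0; ++$i) {
        if (!isset($tree[$node_id])) {
            return false;                              // unknown node: nothing to check
        }
        // 'yyyy-mm-dd hh:mm:ss' strings sort chronologically
        if (strcmp($tree[$node_id]['expiry'], $now) <= 0) {
            return true;
        }
        $node_id = $tree[$node_id]['parent_id'];       // climb one level
    }
    return false;   // reached the root, or gave up on a circular reference
}

$tree = [
    1 => ['parent_id' => 0, 'expiry' => '2010-01-01 00:00:00'],
    2 => ['parent_id' => 1, 'expiry' => '9999-12-31 23:59:59'],
    3 => ['parent_id' => 0, 'expiry' => '9999-12-31 23:59:59'],
    4 => ['parent_id' => 5, 'expiry' => '9999-12-31 23:59:59'],
    5 => ['parent_id' => 4, 'expiry' => '9999-12-31 23:59:59'],  // circular: 4 <-> 5
];
$now = '2020-06-01 00:00:00';
```

Node 2 is not expired itself but inherits the expiry of its parent, node 1. The same loop, testing the embargo date instead, would fit is_under_embargo().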
is_under_embargo (line 1031)

determine if any of the ancestors or $node_id itself is under embargo

This climbs the tree upward, starting at $node_id, to see if any nodes are under embargo. If an embargo'ed node is detected, TRUE is returned. If none of the nodes are under embargo, then FALSE is returned.

Note that this routine looks strictly at the embargo property, it is very well possible that a node is expired, see is_expired().

Also note that this routine currently also tries to 'fix' the node database when a circular reference is detected. This doesn't really belong here, but for the time being it is convenient to have this auto-repair mechanism here. The node that is fixed is the section we are looking at after MAXIMUM_ITERATIONS tries, which is not necessarily the node we started with.

  • return: TRUE if any ancestor (or node_id) is under embargo, otherwise FALSE
  • todo: this function also 'repairs' circular references. This should move to a separate tree-repair function but for the time being it is "convenient" to have automatic repairs...
  • uses: $DB
bool is_under_embargo (array &$tree, int $node_id)
  • array &$tree: family tree
  • int $node_id
javascript_alert (line 298)

massage a message and generate a javascript alert()

  • return: javascript code with alert() function call with properly escaped message string
  • usedby: login_page_close()
string javascript_alert (string $message)
  • string $message: message to display
lock_record (line 655)

put a (co-operative) lock on a record

this tries to set the (co-operative) lock on the record with serial (pkey) $id in table $tablename by setting the $locked_by field to our own session_id. This is the companion routine of lock_release().

The mechanism of co-operative locking works as follows. Some tables (such as the 'nodes' table) have an int field, e.g. 'locked_by_session_id'. This field can either be NULL (indicating that the record is not locked) or hold the primary key of a session (indicating that the record is locked and also by which session).

Obtaining a lock boils down to updating the table and setting that field to the session_id. As long as the underlying database system guarantees that execution of an UPDATE statement is not interrupted, we can use UPDATE as a 'Test-And-Set'-function. According to the documentation MySQL does this.

The procedure is as follows.

  1. We try to set the locked_by-field to our session_id on the condition that the previous value of that field is NULL. If this succeeds, we have effectively locked the record.
  2. If this fails, we retrieve the current value of the field to see which session has locked it. If this happens to be us, we had already locked the record before and we're done.
  3. If another session_id holds the lock, we check for that session's existence. If it still exists, we're out of luck: we can't obtain the lock.
  4. If that other session no longer exists, we try to replace that other session's session_id with our own session_id, once again using a single UPDATE (avoiding another race condition). If that succeeds we're done and we have the lock; if it fails we're also done but without lock.

If locking the record fails because the record is already locked by another session, this routine returns information about that other session in $lockinfo. It is up to the caller to use this information or not.

Note. A record can stay locked if the webbrowser of the locking session has crashed. Eventually this will be resolved if the crashed session is removed from the sessions table. However, the user may have restarted her browser while the record was locked. From the new session it appears that the record is still locked. This may take a while. Mmmmm... The other option is to lock on a per-user basis rather than per-session basis. Mmmm... Should we ask the user to override the session if it happens to be the same user? Mmm. put it on the todo list. (A small improvement might be to call the garbage collection between step 2 and 3. Oh well).

  • return: TRUE if locked successfully, FALSE on error or already locked; extra info returned in $lockinfo
  • todo: we need to resolve the problem of crashing browsers and locked records
  • todo: perhaps we can save 1 trip to the database by checking for something like UPDATE SET locked_by = $session_id WHERE (id = $id) AND ((locked_by IS NULL) OR (locked_by = $session_id)) but I don't know how many affected rows that would yield if we already had the lock and effectively nothing changes in the record. (Perhaps always update atime to force 1 affected row?)
  • todo: do we need a 'force lock' option to forcefully take over spurious locks?
  • usedby: lock_release_node()
  • usedby: lock_record_node()
bool lock_record (int $id, array &$lockinfo, string $tablename, string $pkey, string $locked_by, string $locked_since)
  • int $id: the primary key of the record to lock
  • array &$lockinfo: returns information about the session that already locked this record
  • string $tablename: the name of the table
  • string $pkey: name of the field holding the serial (pkey)
  • string $locked_by: name of the field to hold our session_id indicating we locked the record
  • string $locked_since: name of the field holding the datetime when the lock was obtained
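The four-step procedure described above can be simulated without a database. In this sketch the array $locks stands in for the locked_by field of a table (record id => session_id or NULL) and $sessions for the sessions table; the names and the 'session_id' key in $lockinfo are hypothetical. The closure plays the role of the single conditional UPDATE used as Test-And-Set.

```php
<?php
// Sketch only: co-operative locking with an in-memory 'table'.
function lock_sketch(array &$locks, array $sessions, int $id, int $me, array &$lockinfo): bool
{
    // Mimics: UPDATE ... SET locked_by = $new WHERE pkey = $id AND locked_by is $old
    $test_and_set = function ($old, $new) use (&$locks, $id): bool {
        if (!array_key_exists($id, $locks) || $locks[$id] !== $old) {
            return false;                       // 0 affected rows
        }
        $locks[$id] = $new;
        return true;                            // 1 affected row
    };
    if ($test_and_set(null, $me)) {
        return true;                            // step 1: record was free, now ours
    }
    if (!array_key_exists($id, $locks)) {
        return false;                           // no such record
    }
    $holder = $locks[$id];
    if ($holder === $me) {
        return true;                            // step 2: we already had the lock
    }
    if (in_array($holder, $sessions, true)) {
        $lockinfo = ['session_id' => $holder];  // step 3: the other session still exists
        return false;
    }
    return $test_and_set($holder, $me);         // step 4: take over a stale lock
}

$locks = [10 => null, 11 => 5, 12 => 9];        // record 12 is locked by vanished session 9
$sessions = [5, 7];
$lockinfo = [];
$got_free = lock_sketch($locks, $sessions, 10, 7, $lockinfo);
$got_again = lock_sketch($locks, $sessions, 10, 7, $lockinfo);
$got_taken = lock_sketch($locks, $sessions, 11, 7, $lockinfo);
$got_stale = lock_sketch($locks, $sessions, 12, 7, $lockinfo);
```

Session 7 obtains records 10 and 12, is refused record 11, and finds the holder of record 11 (session 5) in $lockinfo.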
lock_record_node (line 575)

get record lock on a node

this is a wrapper around lock_record() for locking nodes.

  • return: TRUE if locked successfully, FALSE on error or already locked; extra info returned in $lockinfo
  • uses: lock_record()
bool lock_record_node (int $node_id, array &$lockinfo)
  • int $node_id: the primary key of the node to lock
  • array &$lockinfo: returns information about the session that already locked this record
lock_release (line 744)

unlock a record that was previously successfully locked

this removes the (co-operative) lock on the record with serial (pkey) $id in table $tablename by setting the $locked_by field to NULL. This is the companion routine of lock_record().

  • return: TRUE if lock removed successfully, FALSE on error or lock not found
bool lock_release (int $id, string $tablename, string $pkey, string $locked_by, string $locked_since)
  • int $id: the primary key of the record to unlock
  • string $tablename: the name of the table
  • string $pkey: name of the field holding the serial (pkey)
  • string $locked_by: name of the field holding the session_id of the session that locked the record
  • string $locked_since: name of the field holding the datetime when the lock was obtained
lock_release_node (line 588)

release lock on a node

this is a wrapper around lock_release() for unlocking nodes.

  • return: TRUE if lock removed successfully, FALSE on error or lock not found
  • uses: lock_release()
bool lock_release_node (int $node_id)
  • int $node_id: the primary key of the node record to unlock
logger (line 363)

a simple function to log information to the database 'for future reference'

This adds a message to the table log_messages, including a time, the remote address and (of course) a message. See also the standard PHP-function syslog(). We use the existing symbolic constants for priority. Default value is LOG_INFO.

Note that messages with a priority LOG_DEBUG are only written to the log if the global parameter $CFG->debug is TRUE. All other messages are simply logged, no further questions asked.

If the caller does not provide a user_id, this routine attempts to read the user_id from the global $_SESSION array, i.e. we try to link events to a particular user if possible.

Note that with a field definition of varchar(255) there is room to store either an IPv4 address (max 15 bytes) or a full-blown IPv6 address (max 39 bytes).

  • return: FALSE on error, TRUE on success
  • todo: should we make this configurable and maybe log directly to syslog (with automatic logrotate) or do we want to keep this 'self-contained' (the webmaster can read the table, but not the machine's syslog)?
  • usedby: login_send_bypass()
  • usedby: login_send_laissez_passer()
  • uses: $CFG
bool logger (string $message, [int $priority = LOG_INFO], [ $user_id = ''])
  • string $message: the message to write to the log
  • int $priority: loglevel, see PHP-function syslog() for a list of predefined constants
  • $user_id
magic_unquote (line 83)

this circumvents the 'magic' in magic_quotes_gpc() by conditionally stripping slashes

Magic quotes are a royal pain for portability. If magic quotes are enabled, this function reverses the effect. There are three PHP-parameters in php.ini affecting the magic:

  • the directive 'magic_quotes_runtime'
  • the directive 'magic_quotes_gpc'
  • the directive 'magic_quotes_sybase'
This routine deals with undoing the effect of the latter two. The effect of magic_quotes_runtime can be undone via set_magic_quotes_runtime(0). This is done once at program start (See initialise() in init.php).

This routine should be used to unquote strings from $_GET[], $_POST[] and $_COOKIE whenever they are needed.

Important note: because third party subsystems may deal with magic quotes on their own, it is a Bad Idea[tm] to globally replace the contents of $_GET[], $_POST[] and $_COOKIE with the unescaped values once at program start. Any subsystem would be confused if get_magic_quotes_gpc() indicates that the magic is in effect whereas in reality the magic was already undone at program start. Yes, this yields a performance penalty, but this magic was a mess right from the start. Hopefully PHP6 will get rid of this magic for once and for all...

  • return: the unescaped string
string magic_unquote (string $value)
  • string $value: a string value that is conditionally unescaped
performance_get_queries (line 547)

return the number of database queries that was executed

  • return: the number of queries
  • uses: $DB
int performance_get_queries ()
performance_get_seconds (line 559)

return the script execution time

  • return: interval between begin execution and now
  • todo: maybe we should get rid of this $PERFORMANCE object, because it doesn't do that much anyway
double performance_get_seconds ()
quasi_random_string (line 282)

generate a string with quasi-random characters

This generates a string of $length quasi-random characters. The optional parameter $candidates determines which characters are eligible. Popular choices for $candidates are:

  • 10 (minimum): use only digits from 0,...,9
  • 16: use digits 0,...,9 or letters A,...,F
  • 36 (default): use digits 0,...,9 or letters A,...,Z
  • 62: use digits 0,...,9 or letters A,...,Z or letters a,...,z
If $candidates is smaller than 10, 10 is used; if $candidates is greater than 62, 62 is used.

Note that this is an ASCII-centric routine: we only use plain ASCII letters and digits and nothing of the 64000 other Unicode characters in the Basic Multilingual Plane. The reason is simple: 7-bit ASCII characters have the best chance of getting through communication channels unmangled.

void quasi_random_string (int $length, [int $candidates = 36])
  • int $length: length of the string to generate
  • int $candidates: number of candidate-characters to choose from
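The candidate sets listed above are prefixes of one 62-character alphabet, which suggests a sketch like the following. The name quasi_random_string_sketch() is hypothetical and the use of mt_rand() is an assumption; it is not a cryptographically secure source, in keeping with 'quasi-random'.

```php
<?php
// Sketch only: pick $length characters from the first $candidates of the alphabet.
function quasi_random_string_sketch(int $length, int $candidates = 36): string
{
    $alphabet = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
    $candidates = max(10, min(62, $candidates));   // force range 10,...,62
    $s = '';
    for ($i = 0; $i < $length; ++$i) {
        $s .= $alphabet[mt_rand(0, $candidates - 1)];
    }
    return $s;
}
```

Calling quasi_random_string_sketch(32, 16) yields 32 uppercase hexadecimal digits; a value of 5 for $candidates is silently raised to 10.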
quoted_printable (line 1424)

convert string $s from native format to quoted printable (RFC2045)

this converts the input string $s to quoted printable form as defined in RFC2045 (see http://www.ietf.org/rfc/rfc2045.txt). By default this routine assumes a line-oriented text input. This can be overruled by calling the routine with the parameter $textmode set to FALSE: in that case the input is considered to be a binary string with no embedded newlines.

The routine assumes that the input lines are delimited with $newline. By default this parameter is a LF (Linefeed) but it could be changed to another delimiter using the function parameter $newline.

According to RFC2045 the resulting output lines should be no longer than 76 bytes, even though it is very well possible to use shorter lines. This can be done by setting the parameter $max_length to the desired value. Note that this value is forced to be in the range 4,...,76.

The encoding is defined in section 6.7 of RFC2045 with these five rules.

(1) General 8bit representation: any character may be represented as "=" followed by two uppercase hexadecimal digits.

(2) Literal representation characters "!" to "~" but excluding the "=" may represent themselves.

(3) White space Space " " and tab "\t" at the end of a line must use rule (1); in all other cases either rule (1) or (2) may be applied.

(4) Line breaks The (hard) line breaks in the input must be represented using "\r\n" in the output.

(5) Soft line breaks Output lines may not be longer than 76 bytes. This can be enforced by inserting a soft line break (the string "=\r\n") in the output. This soft line break will disappear once the encoded string is decoded.

The basic conversion algorithm is constructed using two important variables:

  • an integer value ($remaining) indicating the number of bytes left in the current output line
  • a boolean flag ($next_is_newline) indicating if the next input character is a $newline
The variable $remaining keeps track of situations where the current character (either as (1) General 8bit representation or (2) Literal representation) might not fit on the current line (e.g. 2 bytes left requires an 8bit representation to be moved to the next output line). The flag $next_is_newline is used to make the best possible use of the available remaining space in the output, e.g. if the current character is exactly as long as the remaining space, we can output that character on the current output line, because we are sure that it is the last character on the current output line so there cannot be a soft return next.

Note that spaces (ASCII 32) and tabs (ASCII 9) are treated differently depending on their position in the line. The rule is that both should be represented as "=20" or "=09" at the end of an input line and that it is allowed to use " " or "\t" when NOT at the end of an input line. In the latter case, the output line will always end with a soft line break "=\r\n" which makes sure that there are no trailing spaces/tabs in the output line anyway.

Also note that the end of the input $s is also flagged via setting $next_is_newline. This is an optimisation which treats spaces and tabs at the end of the input as if they were at the end of an input line, i.e. converting to "=20" or "=09". This means that the output will never end with a space or a tab, even if the input does.

Note that in case of a binary conversion the input character(s) that might otherwise indicate a newline are to be considered as binary data. However, if the data is completely binary, it probably doesn't make sense to use Quoted-Printable in the first place (base64 would probably be a better choice).

Reference: see http://www.ietf.org/rfc/rfc2045.txt.

  • return: encoded string according to RFC2045
  • todo: should we change the code to accommodate the canonical newline CRLF in the input?
string quoted_printable (string $s, [bool $textmode = TRUE], [string $newline = "\n"], [int $max_length = 76])
  • string $s: source string
  • bool $textmode: TRUE means newlines count as hard line breaks, FALSE is binary data
  • string $newline: native character indicating end of line
  • int $max_length: indicates the limit for output lines (excluding the CRLF)
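
The interplay of $remaining and $next_is_newline described above can be sketched as follows. This is a minimal illustration, not the code from waslib.php: the helper names qp_sketch_token() and qp_sketch_fits() are hypothetical, and the sketch assumes that a soft line break needs one byte ("=") of the line budget counted in $remaining.

```php
<?php
// Hypothetical helpers for illustration only; not part of waslib.php.

// Choose the representation of a single input byte.
function qp_sketch_token($char, $next_is_newline) {
    $ord = ord($char);
    if ($char === ' ' || $char === "\t") {
        // Whitespace at the end of an input line (or of the input) is encoded,
        // embedded whitespace may stay literal.
        return $next_is_newline ? sprintf('=%02X', $ord) : $char;
    }
    if ($char === '=' || $ord < 33 || $ord > 126) {
        return sprintf('=%02X', $ord); // general 8bit representation, 3 bytes
    }
    return $char; // literal representation, 1 byte
}

// Decide whether a token still fits on the current output line.
function qp_sketch_fits($token, $remaining, $next_is_newline) {
    $length = strlen($token);
    // Assumption: unless this is the last token on the line, one byte
    // must stay free for the "=" of a soft line break.
    return $next_is_newline ? ($length <= $remaining) : ($length < $remaining);
}

$token = qp_sketch_token("\xE9", FALSE);          // "=E9"
$fits_mid_line = qp_sketch_fits($token, 3, FALSE); // FALSE: no room for "="
$fits_at_end = qp_sketch_fits($token, 3, TRUE);    // TRUE: last token on line
```

With exactly three bytes left, the encoded byte is only placed on the current line when a newline (or the end of the input) follows, which is the case the flag is meant to catch.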
redirect_and_exit (line 520)

redirect to another url by sending an http header

nothing redirect_and_exit (string $url, [ $message = ''])
  • string $url: the url to redirect to
  • $message
replace_crlf (line 315)

unfold a possible multiline string

This removes all linefeeds and carriage returns from a string. Typical use would be to strip newlines from the subject line of a mail message, because they might interfere with proper sending of mail headers.

  • return: the string with offending characters replaced
string replace_crlf (string $multiline_string, [string $replacement = ''])
  • string $multiline_string: the multiline string to strip
  • string $replacement: (optional) the string to replace newlines
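
A minimal sketch of the behaviour described above, assuming a plain str_replace() is sufficient; the function name replace_crlf_sketch() is hypothetical and the real replace_crlf() may be implemented differently.

```php
<?php
// Hypothetical re-implementation for illustration; not the waslib.php code.
function replace_crlf_sketch($multiline_string, $replacement = '') {
    // Replace every carriage return and every linefeed individually.
    return str_replace(array("\r", "\n"), $replacement, $multiline_string);
}

// A subject line with an injected extra header is collapsed to a single line.
$subject = "Hello\r\nBcc: someone@example.com";
$clean = replace_crlf_sketch($subject);
```

Note that in this sketch a CRLF pair yields two copies of $replacement, because both characters are replaced separately.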
sanitise_filename (line 1289)

sanitise a string to make it acceptable as a filename/directoryname

this routine analyses and maybe converts the input string as follows:

  • all leading and trailing dots, spaces, dashes, underscores, backslashes and slashes are removed
  • all embedded spaces, backslashes and slashes are converted to underscores
  • only letters, digits, dots, dashes or underscores are retained
  • all sequences of 2 or more underscores are replaced with a single underscore
  • finally all 'forbidden' words (including empty string) get an underscore prefixed
Note that this sanitising only satisfies the basic rules for filenames; creating a new file with a sanitised name may still clash with an existing file or subdirectory.

Also note that a full pathname will yield something that looks like a simple filename without directories or drive letter: C:\Program Files\Apache Group\htpasswd becomes C_Program_Files_Apache_Group_htpasswd and /etc/passwd becomes etc_passwd. Also this routine makes a URL look like a filename: http://www.example.com becomes http_www.example.com.

Finally note that we don't even attempt to transliterate utf8-characters or any other characters between 128 and 255; these are simply removed.

  • return: sanitised filename which is never empty
  • todo: should we check for overlong UTF-8 encodings: C0 AF C0 AE C0 AE C0 AF equates to /../ or is that dealt with already by filtering on letters/digits and embedded dots/dashes/underscores?
string sanitise_filename (string $filename)
  • string $filename: the string to sanitise
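
The five steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the actual routine: the name sanitise_filename_sketch() is hypothetical, and the list of 'forbidden' words is a guess, since the real list in waslib.php is not shown in this documentation.

```php
<?php
// Hypothetical sketch of the documented steps; not the waslib.php code.
function sanitise_filename_sketch($filename) {
    // 1. strip leading/trailing dots, spaces, dashes, underscores, (back)slashes
    $s = trim($filename, '. -_\\/');
    // 2. convert embedded spaces, backslashes and slashes to underscores
    $s = str_replace(array(' ', '\\', '/'), '_', $s);
    // 3. retain only letters, digits, dots, dashes and underscores
    $s = preg_replace('/[^A-Za-z0-9._-]/', '', $s);
    // 4. collapse runs of two or more underscores
    $s = preg_replace('/_{2,}/', '_', $s);
    // 5. prefix 'forbidden' words (assumed list, including the empty string)
    $forbidden = array('', 'con', 'prn', 'aux', 'nul');
    if (in_array(strtolower($s), $forbidden, TRUE)) {
        $s = '_' . $s;
    }
    return $s;
}

$windows_path = sanitise_filename_sketch('C:\Program Files\Apache Group\htpasswd');
$unix_path = sanitise_filename_sketch('/etc/passwd');
$url = sanitise_filename_sketch('http://www.example.com');
```

Under these assumptions the three examples from the description come out as C_Program_Files_Apache_Group_htpasswd, etc_passwd and http_www.example.com, and an empty input yields a single underscore, so the result is never empty.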
string2time (line 225)

convert a string representation of a date/time to a timestamp

This is a crude date/time parser. We collect digits and convert them to integers. With the integers we fill an array with at least 6 integers, corresponding to year, month, day, hours, minutes and seconds. If there are fewer than six numbers in the source string, the value 0 is used for the remaining elements. Note that a number in this context is always a non-negative number because a dash (or minus) is considered a delimiter.

Note that valid date/time values are limited to the number of seconds that can be represented in a signed long integer, where 0 equates to 1970-01-01 00:00:00 (the Unix epoch). The upper limit for a 32-bit int is 2038-01-19 03:14:07 UTC.

  • return: unix timestamp (seconds since epoch) or FALSE on error
bool|long string2time (string $timestring)
  • string $timestring: date/time in the form yyyy-mm-dd hh:mm:ss
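
A minimal sketch of the parsing approach described above. The name string2time_sketch() is hypothetical, and several details are assumptions rather than documented behaviour: the sketch interprets the values as UTC via gmmktime() so the result does not depend on the server timezone, and it treats an input without digits or with an impossible calendar date as an error.

```php
<?php
// Hypothetical sketch; the real string2time() may differ in its error rules
// and in its choice of timezone.
function string2time_sketch($timestring) {
    // Collect all runs of digits; anything else (including '-') is a delimiter.
    if (preg_match_all('/\d+/', $timestring, $matches) < 1) {
        return FALSE; // assumption: no digits at all counts as an error
    }
    // Convert to integers and pad with zeros up to six elements.
    $numbers = array_pad(array_map('intval', $matches[0]), 6, 0);
    list($year, $month, $day, $hours, $minutes, $seconds) = $numbers;
    if (!checkdate($month, $day, $year)) {
        return FALSE; // assumption: impossible dates are rejected
    }
    return gmmktime($hours, $minutes, $seconds, $month, $day, $year);
}

$epoch = string2time_sketch('1970-01-01 00:00:00');
$next_day = string2time_sketch('1970-01-02'); // time parts default to 0
```

Here the second call shows the zero padding: only three numbers are present, so hours, minutes and seconds are all taken as 0.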
t (line 332)

translation of phrases via a function with a very short name

This is only a wrapper function for $LANGUAGE->get_phrase()

  • return: translated string with optional values from array 'replace' inserted
  • uses: $LANGUAGE
string t (string $phrase_key, [string $full_domain = ''], [array $replace = ''], [string $location_hint = ''], [string $language = ''])
  • string $phrase_key: indicates the phrase that needs to be translated
  • string $full_domain: (optional) indicates the text domain (perhaps with a prefix)
  • array $replace: (optional) an assoc array with key-value-pairs to insert into the translation
  • string $location_hint: (optional) hints at a directory location of language files
  • string $language: (optional) target language

Documentation generated on Wed, 11 May 2011 23:45:46 +0200 by phpDocumentor 1.4.0