<!-- $PostgreSQL$ -->
<appendix id="datetime-appendix">
<title>Date/Time Support</title>
<para>
<productname>PostgreSQL</productname> uses an internal heuristic
parser for all date/time input. Dates and times are input as
strings and are broken up into distinct fields, with a preliminary
determination of what kind of information each field can
contain. Each field is interpreted and either assigned a numeric
value, ignored, or rejected.
The parser contains internal lookup tables for all textual fields,
including months, days of the week, and time zones.
</para>
<para>
This appendix includes information on the content of these
lookup tables and describes the steps used by the parser to decode
dates and times.
</para>
<sect1 id="datetime-input-rules">
<title>Date/Time Input Interpretation</title>
<para>
All date/time type inputs are decoded using the following procedure.
</para>
<procedure>
<step>
<para>
Break the input string into tokens and categorize each token as
a string, time, time zone, or number.
</para>
<substeps>
<step>
<para>
If the numeric token contains a colon (<literal>:</>), this is
a time string. Include all subsequent digits and colons.
</para>
</step>
<step>
<para>
If the numeric token contains a dash (<literal>-</>), slash
(<literal>/</>), or two or more dots (<literal>.</>), this is
a date string which might have a text month. If a date token has
already been seen, it is instead interpreted as a time zone
name (e.g., <literal>America/New_York</>).
</para>
</step>
<step>
<para>
If the token is numeric only, then it is either a single field
or an ISO 8601 concatenated date (e.g.,
<literal>19990113</literal> for January 13, 1999) or time
(e.g., <literal>141516</literal> for 14:15:16).
</para>
</step>
<step>
<para>
If the token starts with a plus (<literal>+</>) or minus
(<literal>-</>), then it is either a numeric time zone or a special
field.
</para>
</step>
</substeps>
</step>
<step>
<para>
If the token is a text string, match it against the possible strings:
</para>
<substeps>
<step>
<para>
Do a binary-search table lookup for the token as a time zone
abbreviation.
</para>
</step>
<step>
<para>
If not found, do a similar binary-search table lookup to match
the token as either a special string (e.g., <literal>today</literal>),
day (e.g., <literal>Thursday</literal>),
month (e.g., <literal>January</literal>),
or noise word (e.g., <literal>at</literal>, <literal>on</literal>).
</para>
</step>
<step>
<para>
If still not found, throw an error.
</para>
</step>
</substeps>
</step>
<step>
<para>
When the token is a number or number field:
</para>
<substeps>
<step>
<para>
If there are eight or six digits,
and if no other date fields have been previously read, then interpret
as a <quote>concatenated date</quote> (e.g.,
<literal>19990118</literal> or <literal>990118</literal>).
The interpretation is <literal>YYYYMMDD</> or <literal>YYMMDD</>.
</para>
</step>
<step>
<para>
If the token is three digits
and a year has already been read, then interpret as day of year.
</para>
</step>
<step>
<para>
If the token is four or six digits and a year has already been read,
then interpret as a time (<literal>HHMM</> or <literal>HHMMSS</>).
</para>
</step>
<step>
<para>
If the token is three or more digits and no date fields have yet been
found, interpret as a year (this forces yy-mm-dd ordering of the
remaining date fields).
</para>
</step>
<step>
<para>
Otherwise the date field ordering is assumed to follow the
<varname>DateStyle</> setting: mm-dd-yy, dd-mm-yy, or yy-mm-dd.
Throw an error if a month or day field is found to be out of range.
</para>
</step>
</substeps>
</step>
<step>
<para>
If BC has been specified, negate the year and add one for
internal storage. (There is no year zero in the Gregorian
calendar, so numerically 1 BC becomes year zero.)
</para>
</step>
<step>
<para>
If BC was not specified, and if the year field was two digits in length,
then adjust the year to four digits. If the value is less than 70, then
add 2000, otherwise add 1900.
<tip>
<para>
Gregorian years AD 1-99 can be entered by using 4 digits with leading
zeros (e.g., <literal>0099</> is AD 99).
</para>
</tip>
</para>
</step>
</procedure>
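<para>
The year adjustments in the last two steps, together with the
<quote>concatenated date</quote> rule, can be sketched in a few lines of
Python. This is only an illustration of the rules as stated in this
procedure, not the server's actual code. The function names
<literal>normalize_year</> and <literal>split_concatenated_date</> are
hypothetical, and the range check is deliberately simplified: it does not
know how many days each month has.
</para>
<programlisting><![CDATA[
```python
def normalize_year(year, bc=False, two_digit=False):
    """Return the internally stored year for a parsed year field."""
    if bc:
        # Negate the year and add one, so that 1 BC is stored as year zero.
        return -year + 1
    if two_digit:
        # Two-digit years below 70 are taken as 20xx, all others as 19xx.
        return year + 2000 if year < 70 else year + 1900
    # Four-digit input such as 0099 is taken literally (AD 99).
    return year


def split_concatenated_date(token):
    """Split a YYYYMMDD or YYMMDD token into (year, month, day)."""
    if not token.isdigit() or len(token) not in (6, 8):
        raise ValueError("not a concatenated date: " + token)
    year_digits = len(token) - 4
    year = normalize_year(int(token[:year_digits]),
                          two_digit=(year_digits == 2))
    month = int(token[year_digits:year_digits + 2])
    day = int(token[year_digits + 2:])
    # Simplified range check (assumption): no per-month day counts.
    if not 1 <= month <= 12 or not 1 <= day <= 31:
        raise ValueError("month or day out of range: " + token)
    return year, month, day
```
]]></programlisting>
<para>
Under these assumptions, both <literal>19990118</> and
<literal>990118</> resolve to the same date, January 18, 1999, while a
two-digit year of <literal>69</> is taken as 2069.
</para>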
</sect1>
<sect1 id="datetime-keywords">
<title>Date/Time Key Words</title>
<para>
<xref linkend="datetime-month-table"> shows the tokens that are
recognized as names of months.
</para>
<table id="datetime-month-table">
<title>Month Names</title>
<tgroup cols="2">
<thead>
<row>
<entry>Month</entry>
<entry>Abbreviations</entry>
</row>
</thead>
<tbody>
<row>
<entry>January</entry>
<entry>Jan</entry>
</row>
<row>
<entry>February</entry>
<entry>Feb</entry>
</row>
<row>
<entry>March</entry>
<entry>Mar</entry>
</row>
<row>
<entry>April</entry>
<entry>Apr</entry>
</row>
<row>
<entry>May</entry>
<entry></entry>
</row>
<row>
<entry>June</entry>
<entry>Jun</entry>
</row>
<row>
<entry>July</entry>
<entry>Jul</entry>
</row>
<row>
<entry>August</entry>
<entry>Aug</entry>
</row>
<row>
<entry>September</entry>
<entry>Sep, Sept</entry>
</row>
<row>
<entry>October</entry>
<entry>Oct</entry>
</row>
<row>
<entry>November</entry>
<entry>Nov</entry>
</row>
<row>
<entry>December</entry>
<entry>Dec</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
<xref linkend="datetime-dow-table"> shows the tokens that are
recognized as names of days of the week.
</para>
<table id="datetime-dow-table">
<title>Day of the Week Names</title>
<tgroup cols="2">
<thead>
<row>
<entry>Day</entry>
<entry>Abbreviations</entry>
</row>
</thead>
<tbody>
<row>
<entry>Sunday</entry>
<entry>Sun</entry>
</row>
<row>
<entry>Monday</entry>
<entry>Mon</entry>
</row>
<row>
<entry>Tuesday</entry>
<entry>Tue, Tues</entry>
</row>
<row>
<entry>Wednesday</entry>
<entry>Wed, Weds</entry>
</row>
<row>
<entry>Thursday</entry>
<entry>Thu, Thur, Thurs</entry>
</row>
<row>
<entry>Friday</entry>
<entry>Fri</entry>
</row>
<row>
<entry>Saturday</entry>
<entry>Sat</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
<xref linkend="datetime-mod-table"> shows the tokens that serve
various modifier purposes.
</para>
<table id="datetime-mod-table">
<title>Date/Time Field Modifiers</title>
<tgroup cols="2">
<thead>
<row>
<entry>Identifier</entry>
<entry>Description</entry>
</row>
</thead>
<tbody>
<row>
<entry><literal>AM</literal></entry>
<entry>Time is before 12:00</entry>
</row>
<row>
<entry><literal>AT</literal></entry>
<entry>Ignored</entry>
</row>
<row>
<entry><literal>JULIAN</>, <literal>JD</>, <literal>J</></entry>
<entry>Next field is Julian Day</entry>
</row>
<row>
<entry><literal>ON</literal></entry>
<entry>Ignored</entry>
</row>
<row>
<entry><literal>PM</literal></entry>
<entry>Time is on or after 12:00</entry>
</row>
<row>
<entry><literal>T</literal></entry>
<entry>Next field is time</entry>
</row>
</tbody>
</tgroup>
</table>
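<para>
The three tables in this section can be combined into a single sorted
keyword table and searched with a binary search, in the spirit of the
lookup described in <xref linkend="datetime-input-rules">. The Python
sketch below is illustrative only: the table layout, the returned
<literal>(kind, value)</> pairs, the day numbering starting at zero for
Sunday, and the case folding are all assumptions, and the name
<literal>lookup</> is hypothetical.
</para>
<programlisting><![CDATA[
```python
from bisect import bisect_left

# Full names and abbreviations, taken from the tables in this section.
MONTHS = {
    "january": ["jan"], "february": ["feb"], "march": ["mar"],
    "april": ["apr"], "may": [], "june": ["jun"], "july": ["jul"],
    "august": ["aug"], "september": ["sep", "sept"], "october": ["oct"],
    "november": ["nov"], "december": ["dec"],
}
DAYS = {
    "sunday": ["sun"], "monday": ["mon"], "tuesday": ["tue", "tues"],
    "wednesday": ["wed", "weds"], "thursday": ["thu", "thur", "thurs"],
    "friday": ["fri"], "saturday": ["sat"],
}
MODIFIERS = {
    "am": "AM", "pm": "PM", "at": "ignore", "on": "ignore",
    "julian": "julian", "jd": "julian", "j": "julian", "t": "time",
}


def build_table():
    """Return (token, kind, value) entries sorted by token."""
    entries = []
    for number, (name, abbrevs) in enumerate(MONTHS.items(), start=1):
        entries += [(tok, "month", number) for tok in [name] + abbrevs]
    for number, (name, abbrevs) in enumerate(DAYS.items()):
        entries += [(tok, "dow", number) for tok in [name] + abbrevs]
    entries += [(tok, "modifier", val) for tok, val in MODIFIERS.items()]
    # Tokens are unique, so sorting only ever compares the token strings.
    return sorted(entries)


TABLE = build_table()
TOKENS = [entry[0] for entry in TABLE]


def lookup(token):
    """Binary-search the keyword table; return (kind, value) or None."""
    key = token.lower()  # case folding is an assumption of this sketch
    i = bisect_left(TOKENS, key)
    if i < len(TOKENS) and TOKENS[i] == key:
        return TABLE[i][1:]
    return None
```
]]></programlisting>
<para>
With this table, <literal>Sept</> resolves to month 9 and
<literal>JD</> to the Julian Day modifier, while an unknown word yields
no match, which corresponds to the error case of the input procedure.
</para>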
</sect1>
<sect1 id="datetime-config-files">
<title>Date/Time Configuration Files</title>
<indexterm>
<primary>time zone</primary>
<secondary>input abbreviations</secondary>
</indexterm>
<para>
Since time zone abbreviations are not well standardized,
<productname>PostgreSQL</productname> provides a means to customize
the set of abbreviations accepted by the server. The
<xref linkend="guc-timezone-abbreviations"> run-time parameter
determines the active set of abbreviations. While this parameter
can be altered by any database user, the possible values for it
are under the control of the database administrator: they
are in fact the names of configuration files stored in
<filename>.../share/timezonesets/</> of the installation directory.
By adding or altering files in that directory, the administrator
can set local policy for time zone abbreviations.
</para>
<para>
<literal>timezone_abbreviations</> can be set to any file name
found in <filename>.../share/timezonesets/</>, if the file's name
is entirely alphabetic. (The prohibition against non-alphabetic
characters in <literal>timezone_abbreviations</> prevents reading
files outside the intended directory, as well as reading editor
backup files and other extraneous files.)
</para>
<para>
A time zone abbreviation file can contain blank lines and comments
beginning with <literal>#</>. Non-comment lines must have one of
these formats:
<synopsis>
<replaceable>time_zone_name</replaceable> <replaceable>offset</replaceable>
<replaceable>time_zone_name</replaceable> <replaceable>offset</replaceable> D
@INCLUDE <replaceable>file_name</replaceable>
@OVERRIDE
</synopsis>
</para>
<para>
A <replaceable>time_zone_name</replaceable> is just the abbreviation
being defined. The <replaceable>offset</replaceable> is the zone's
offset in seconds from UTC, positive being east from Greenwich and
negative being west. For example, -18000 would be five hours west
of Greenwich, or North American east coast standard time. <literal>D</>
indicates that the zone name represents local daylight-savings time
rather than standard time. Since all known time zone offsets are on
15-minute boundaries, the number of seconds has to be a multiple of 900.
</para>
<para>
The <literal>@INCLUDE</> syntax allows inclusion of another file in the
<filename>.../share/timezonesets/</> directory. Inclusion can be nested,
to a limited depth.
</para>
<para>
The <literal>@OVERRIDE</> syntax indicates that subsequent entries in the
file can override previous entries (i.e., entries obtained from included
files). Without this, conflicting definitions of the same time zone
abbreviation are considered an error.
</para>
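<para>
To make the file format concrete, the following Python sketch checks
lines against the four formats shown in the synopsis. It is an
illustration of the rules in this section, not the server's parser: the
function name <literal>parse_tz_abbrev_lines</> and the tuples it returns
are hypothetical, it treats everything after a <literal>#</> as a
comment (an assumption), and it does not follow
<literal>@INCLUDE</> directives or resolve conflicts.
</para>
<programlisting><![CDATA[
```python
def parse_tz_abbrev_lines(lines):
    """Validate abbreviation-file lines and return them as tuples."""
    parsed = []
    for lineno, raw in enumerate(lines, start=1):
        # Assumption: a '#' starts a comment anywhere on the line.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank line or comment
        fields = line.split()
        if fields[0] == "@INCLUDE" and len(fields) == 2:
            parsed.append(("include", fields[1]))
        elif fields[0] == "@OVERRIDE" and len(fields) == 1:
            parsed.append(("override",))
        elif len(fields) in (2, 3) and not fields[0].startswith("@"):
            offset = int(fields[1])  # seconds east of Greenwich
            if offset % 900 != 0:
                raise ValueError(
                    "line %d: offset must be a multiple of 900" % lineno)
            if len(fields) == 3 and fields[2] != "D":
                raise ValueError(
                    "line %d: unexpected field %r" % (lineno, fields[2]))
            is_dst = len(fields) == 3
            parsed.append(("zone", fields[0], offset, is_dst))
        else:
            raise ValueError("line %d: unrecognized format" % lineno)
    return parsed


# A made-up file body used only to exercise the sketch.
sample = [
    "# local additions",
    "@INCLUDE Default",
    "@OVERRIDE",
    "EST  -18000",
    "EDT  -14400 D",
]
entries = parse_tz_abbrev_lines(sample)
```
]]></programlisting>
<para>
In this sample, <literal>EST</> is accepted with an offset of -18000
seconds (five hours west of Greenwich), and <literal>EDT</> is marked as
daylight-savings time. An offset such as 100, which is not a multiple of
900, is rejected.
</para>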
<para>
In an unmodified installation, the file <filename>Default</> contains
all the non-conflicting time zone abbreviations for most of the world.
Additional files <filename>Australia</> and <filename>India</> are
provided for those regions: these files first include the
<literal>Default</> file and then add or modify time zones as needed.
</para>
<para>
For reference purposes, a standard installation also contains files
<filename>Africa.txt</>, <filename>America.txt</>, etc., containing
information about every time zone abbreviation known to be in use
according to the <literal>zoneinfo</> time zone database. The zone name
definitions found in these files can be copied and pasted into a custom
configuration file as needed. Note that these files cannot be directly
referenced as <literal>timezone_abbreviations</> settings, because of
the dot embedded in their names.
</para>
<note>
<para>
If an error occurs while reading the time zone data sets, no new value is
applied and the old set is kept. If the error occurs while starting the
database, startup fails.
</para>
</note>
<caution>
<para>
Time zone abbreviations defined in the configuration file override
non-time-zone meanings built into <productname>PostgreSQL</productname>.
For example, the <filename>Australia</> configuration file defines
<literal>SAT</> (for South Australian Standard Time). When this
file is active, <literal>SAT</> will not be recognized as an abbreviation
for Saturday.
</para>
</caution>
<caution>
<para>
If you modify files in <filename>.../share/timezonesets/</>,
it is up to you to make backups, because a normal database dump
will not include this directory.
</para>
</caution>
</sect1>
<sect1 id="datetime-units-history">
<title>History of Units</title>
<para>
The Julian calendar was introduced by Julius Caesar in 45 BC.
It was in common use in the Western world
until the year 1582, when countries started changing to the Gregorian
calendar. In the Julian calendar, the tropical year is
approximated as 365 1/4 days = 365.25 days. This gives an error of
about 1 day in 128 years.
</para>
<para>
The accumulating calendar error prompted
Pope Gregory XIII to reform the calendar in accordance with
instructions from the Council of Trent.
In the Gregorian calendar, the tropical year is approximated as
365 + 97 / 400 days = 365.2425 days. Thus it takes approximately 3300
years for the tropical year to shift one day with respect to the
Gregorian calendar.
</para>
<para>
The approximation 365+97/400 is achieved by having 97 leap years
every 400 years, using the following rules:
<simplelist>
<member>
Every year divisible by 4 is a leap year.
</member>
<member>
However, every year divisible by 100 is not a leap year.
</member>
<member>
However, every year divisible by 400 is a leap year after all.
</member>
</simplelist>
So, 1700, 1800, 1900, 2100, and 2200 are not leap years. But 1600,
2000, and 2400 are leap years.
By contrast, in the older Julian calendar all years divisible by 4 are leap
years.
</para>
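<para>
The leap-year rules and the drift figures quoted in this section can be
checked with a short Python sketch. The function names are hypothetical,
and the tropical year length of 365.2422 days is an assumed approximate
value that this appendix does not state, so the drift results are
indicative only.
</para>
<programlisting><![CDATA[
```python
TROPICAL_YEAR = 365.2422  # assumed approximate length in days


def is_gregorian_leap_year(year):
    """Divisible by 4, except centuries, except every fourth century."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def is_julian_leap_year(year):
    """In the Julian calendar every year divisible by 4 is a leap year."""
    return year % 4 == 0


def years_per_day_of_drift(mean_calendar_year):
    """Years needed for the calendar to drift one day from the seasons."""
    return 1 / (mean_calendar_year - TROPICAL_YEAR)


# Count the leap years in one full 400-year cycle.
leap_years_per_cycle = sum(
    is_gregorian_leap_year(y) for y in range(1601, 2001))
mean_gregorian_year = 365 + leap_years_per_cycle / 400

julian_drift = years_per_day_of_drift(365.25)
gregorian_drift = years_per_day_of_drift(mean_gregorian_year)
```
]]></programlisting>
<para>
Counting over one 400-year cycle gives 97 leap years, hence the mean
year of 365.2425 days. Under the assumed tropical year,
<literal>julian_drift</> comes out near 128 years and
<literal>gregorian_drift</> near 3300 years, in line with the figures
given in this section.
</para>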
<para>
The papal bull of February 1582 decreed that 10 days should be dropped
from October 1582 so that 15 October should follow immediately after
4 October.
This was observed in Italy, Poland, Portugal, and Spain. Other Catholic
countries followed shortly after, but Protestant countries were
reluctant to change, and the Greek Orthodox countries did not change
until the start of the 20th century.
The reform was observed by Great Britain and its dominions (including
what is now the USA) in 1752.
Thus 2 September 1752 was followed by 14 September 1752.
This is why Unix systems have the <command>cal</command> program
produce the following:
<screen>
$ <userinput>cal 9 1752</userinput>
   September 1752
 S  M Tu  W Th  F  S
       1  2 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
</screen>
</para>
<para>
The SQL standard states that <quote>Within the definition of a
<quote>datetime literal</quote>, the <quote>datetime
value</quote>s are constrained by the natural rules for dates and
times according to the Gregorian calendar</quote>. Dates between
1582-10-05 and 1582-10-14, although eliminated in some countries
by Papal fiat, conform to <quote>natural rules</quote> and are
hence valid dates. <productname>PostgreSQL</> follows the SQL
standard's lead by counting dates exclusively in the Gregorian
calendar, even for years before that calendar was in use.
</para>
<para>
Different calendars have been developed in various parts of the
world, many predating the Gregorian system.
For example,
the beginnings of the Chinese calendar can be traced back to the 14th
century BC. Legend has it that the Emperor Huangdi invented that
calendar in 2637 BC.
The People's Republic of China uses the Gregorian calendar
for civil purposes. The Chinese calendar is used for determining
festivals.
</para>
<para>
The <quote>Julian Date</quote> is unrelated to the <quote>Julian
calendar</quote>.
The Julian Date system was invented by the French scholar
Joseph Justus Scaliger (1540-1609)
and probably takes its name from Scaliger's father,
the Italian scholar Julius Caesar Scaliger (1484-1558).
In the Julian Date system, each day has a sequential number, starting
from JD 0 (which is sometimes called <emphasis>the</> Julian Date).
JD 0 corresponds to 1 January 4713 BC in the Julian calendar, or
24 November 4714 BC in the Gregorian calendar. Julian Date counting
is most often used by astronomers for labeling their nightly observations,
and therefore a date runs from noon UTC to the next noon UTC, rather than
from midnight to midnight: JD 0 designates the 24 hours from noon UTC on
1 January 4713 BC to noon UTC on 2 January 4713 BC.
</para>
<para>
Although <productname>PostgreSQL</> supports Julian Date notation for
input and output of dates (and also uses them for some internal datetime
calculations), it does not observe the nicety of having dates run from
noon to noon. <productname>PostgreSQL</> treats a Julian Date as running
from midnight to midnight.
</para>
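<para>
The correspondence between Gregorian dates and Julian Day numbers can be
reproduced with integer arithmetic. The Python sketch below uses a
commonly used integer formula for the proleptic Gregorian calendar; it is
offered as an illustration and should not be read as a description of the
server's internals. The function names are hypothetical. BC years are
first converted to the year-zero numbering described in
<xref linkend="datetime-input-rules">, and each Julian Day number is
treated as a whole calendar day, midnight to midnight.
</para>
<programlisting><![CDATA[
```python
def bc_to_internal_year(year_bc):
    """Negate a BC year and add one: 1 BC becomes 0, 4714 BC becomes -4713."""
    return 1 - year_bc


def gregorian_to_julian_day(year, month, day):
    """Julian Day number of a proleptic Gregorian date.

    The year uses year-zero numbering (0 means 1 BC).
    """
    # Shift the year so that it starts in March; this puts the leap day
    # at the very end of the shifted year.
    if month > 2:
        month += 1
        year += 4800
    else:
        month += 13
        year += 4799
    century = year // 100
    julian = year * 365 - 32167
    julian += year // 4 - century + century // 4  # leap-day corrections
    julian += 7834 * month // 256 + day           # days before this month
    return julian


jd_zero = gregorian_to_julian_day(bc_to_internal_year(4714), 11, 24)
jd_2000 = gregorian_to_julian_day(2000, 1, 1)
```
]]></programlisting>
<para>
Here <literal>jd_zero</> evaluates to 0, matching the statement that JD 0
falls on 24 November 4714 BC in the Gregorian calendar, and
<literal>jd_2000</> evaluates to 2451545 for 1 January 2000.
</para>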
</sect1>
</appendix>