doc/info/descriptive.texi

   1 @menu
   2 * Introduction to descriptive::
   3 * Functions and Variables for data manipulation::
   4 * Functions and Variables for descriptive statistics::
   5 * Functions and Variables for statistical graphs::
   6 @end menu
   7
   8 @node Introduction to descriptive, Functions and Variables for data manipulation, descriptive-pkg, descriptive-pkg
   9 @section Introduction to descriptive
  10
  11 Package @code{descriptive} contains a set of functions for
  12 making descriptive statistical computations and graphing.
  13 Together with the source code there are three data sets in
  14 your Maxima tree: @code{pidigits.data}, @code{wind.data} and @code{biomed.data}.
  15
  16 Any statistics manual can be used as a reference to the functions in package @code{descriptive}.
  17
  18 For comments, bugs or suggestions, please contact me at @var{'riotorto AT yahoo DOT com'}.
  19
Here is a simple example of how the functions in @code{descriptive} work, depending on whether their arguments are lists or matrices:
  21
  22 @c ===beg===
  23 @c load ("descriptive")$
  24 @c /* univariate sample */   mean ([a, b, c]);
  25 @c matrix ([a, b], [c, d], [e, f]);
  26 @c /* multivariate sample */ mean (%);
  27 @c ===end===
  28 @example
  29 (%i1) load ("descriptive")$
  30 @group
  31 (%i2) /* univariate sample */   mean ([a, b, c]);
  32                             c + b + a
  33 (%o2)                       ---------
  34                                 3
  35 @end group
  36 @group
  37 (%i3) matrix ([a, b], [c, d], [e, f]);
  38                             [ a  b ]
  39                             [      ]
  40 (%o3)                       [ c  d ]
  41                             [      ]
  42                             [ e  f ]
  43 @end group
  44 @group
  45 (%i4) /* multivariate sample */ mean (%);
  46                       e + c + a  f + d + b
  47 (%o4)                [---------, ---------]
  48                           3          3
  49 @end group
  50 @end example
  51
  52 Note that in multivariate samples the mean is calculated for each column.
  53
When there are several samples, possibly of different sizes, the Maxima function @code{map} can be used to get the desired results for each sample:
  55
  56 @c ===beg===
  57 @c load ("descriptive")$
  58 @c map (mean, [[a, b, c], [d, e]]);
  59 @c ===end===
  60 @example
  61 (%i1) load ("descriptive")$
  62 @group
  63 (%i2) map (mean, [[a, b, c], [d, e]]);
  64                         c + b + a  e + d
  65 (%o2)                  [---------, -----]
  66                             3        2
  67 @end group
  68 @end example
  69
  70 In this case, two samples of sizes 3 and 2 were stored into a list.
  71
  72 Univariate samples must be stored in lists like
  73
  74 @c ===beg===
  75 @c s1 : [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5];
  76 @c ===end===
  77 @example
  78 @group
  79 (%i1) s1 : [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5];
  80 (%o1)           [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
  81 @end group
  82 @end example
  83
  84 and multivariate samples in matrices as in
  85
  86 @c ===beg===
  87 @c s2 : matrix ([13.17, 9.29], [14.71, 16.88], [18.50, 16.88],
  88 @c              [10.58, 6.63], [13.33, 13.25], [13.21,  8.12]);
  89 @c ===end===
  90 @example
  91 @group
  92 (%i1) s2 : matrix ([13.17, 9.29], [14.71, 16.88], [18.50, 16.88],
  93              [10.58, 6.63], [13.33, 13.25], [13.21,  8.12]);
  94                         [ 13.17  9.29  ]
  95                         [              ]
  96                         [ 14.71  16.88 ]
  97                         [              ]
  98                         [ 18.5   16.88 ]
  99 (%o1)                   [              ]
 100                         [ 10.58  6.63  ]
 101                         [              ]
 102                         [ 13.33  13.25 ]
 103                         [              ]
 104                         [ 13.21  8.12  ]
 105 @end group
 106 @end example
 107
 108 In this case, the number of columns equals the random variable dimension and the number of rows is the sample size.
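
As a quick check of this convention, the sample size and the dimension can be read off with the core function @code{length}. This is only a sketch: the three-row matrix is a shortened copy of the data above, used for illustration.

@example
s2 : matrix ([13.17, 9.29], [14.71, 16.88], [18.50, 16.88])$
/* Sample size: the number of rows, here 3. */
length (s2);
/* Dimension of the random variable: the length of one row, here 2. */
length (first (s2));
@end example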
 109
Data can be entered by hand, but large samples are usually stored in plain text files. For example, file @code{pidigits.data} contains the first 100 digits of the number @code{%pi}:
 111 @example
 112 @group
 113       3
 114       1
 115       4
 116       1
 117       5
 118       9
 119       2
 120       6
 121       5
 122       3 ...
 123 @end group
 124 @end example
 125
 126 In order to load these digits in Maxima,
 127
 128 @c ===beg===
 129 @c s1 : read_list (file_search ("pidigits.data"))$
 130 @c length (s1);
 131 @c ===end===
 132 @example
 133 (%i1) s1 : read_list (file_search ("pidigits.data"))$
 134 @group
 135 (%i2) length (s1);
 136 (%o2)                          100
 137 @end group
 138 @end example
 139
On the other hand, file @code{wind.data} contains daily average wind speeds at 5 meteorological stations in the Republic of Ireland. This is part of a data set taken at 12 meteorological stations; the original file is freely downloadable from the StatLib Data Repository, and its analysis is discussed in Haslett, J., Raftery, A. E. (1989) @var{Space-time Modelling with Long-memory Dependence: Assessing Ireland's Wind Power Resource, with Discussion}. Applied Statistics 38, 1-50. This loads the data:
 141
 142 @c ===beg===
 143 @c s2 : read_matrix (file_search ("wind.data"))$
 144 @c length (s2);
 145 @c s2 [%]; /* last record */
 146 @c ===end===
 147 @example
 148 (%i1) s2 : read_matrix (file_search ("wind.data"))$
 149 @group
 150 (%i2) length (s2);
 151 (%o2)                          100
 152 @end group
 153 @group
 154 (%i3) s2 [%]; /* last record */
 155 (%o3)            [3.58, 6.0, 4.58, 7.62, 11.25]
 156 @end group
 157 @end example
 158
Some samples contain non-numeric data. As an example, file @code{biomed.data} (which is part of a larger data set downloaded from the StatLib Data Repository) contains four blood measures taken from two groups of patients, @code{A} and @code{B}, of different ages:
 160
 161 @c ===beg===
 162 @c s3 : read_matrix (file_search ("biomed.data"))$
 163 @c length (s3);
 164 @c s3 [1]; /* first record */
 165 @c ===end===
 166 @example
 167 (%i1) s3 : read_matrix (file_search ("biomed.data"))$
 168 @group
 169 (%i2) length (s3);
 170 (%o2)                          100
 171 @end group
 172 @group
 173 (%i3) s3 [1]; /* first record */
 174 (%o3)            [A, 30, 167.0, 89.0, 25.6, 364]
 175 @end group
 176 @end example
 177
The first individual belongs to group @code{A}, is 30 years old, and has blood measures 167.0, 89.0, 25.6 and 364.
 179
Care is needed when working with categorical data. In the next example, the symbol @code{a} was assigned a value at some earlier point, so when a sample containing the categorical value @code{a} is entered, @code{a} is evaluated and replaced by that value:
 181
 182 @c ===beg===
 183 @c a : 1$
 184 @c matrix ([a, 3], [b, 5]);
 185 @c ===end===
 186 @example
 187 (%i1) a : 1$
 188 @group
 189 (%i2) matrix ([a, 3], [b, 5]);
 190                             [ 1  3 ]
 191 (%o2)                       [      ]
 192                             [ b  5 ]
 193 @end group
 194 @end example
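
One way to keep a category label that already has a value is to quote the symbol, or to remove the binding first. The following is a minimal sketch rather than a feature of the package; it relies only on the standard quote operator @code{'} and on @code{kill}. Note that a quoted label would still be replaced if the matrix were evaluated again later.

@example
a : 1$
/* The quote operator prevents evaluation, so the label a is kept. */
matrix (['a, 3], [b, 5]);
/* Alternatively, remove the binding before entering the sample. */
kill (a)$
matrix ([a, 3], [b, 5]);
@end example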
 195
 196 @opencatbox{Categories:}
 197 @category{Descriptive statistics}
 198 @category{Share packages}
 199 @category{Package descriptive}
 200 @closecatbox
 201
 202 @node Functions and Variables for data manipulation, Functions and Variables for descriptive statistics, Introduction to descriptive, descriptive-pkg
 203 @section Functions and Variables for data manipulation
 204
 205
 206
 207 @anchor{build_sample}
 208 @deffn {Function} build_sample @
 209 @fname{build_sample} (@var{list}) @
 210 @fname{build_sample} (@var{matrix})
 211
 212 Builds a sample from a table of absolute frequencies.
 213 The input table can be a matrix or a list of lists, all of
 214 them of equal size. The number of columns or the length of
 215 the lists must be greater than 1. The last element of each
 216 row or list is interpreted as the absolute frequency.
 217 The output is always a sample in matrix form.
 218
 219 Examples:
 220
 221 Univariate frequency table.
 222
 223 @c ===beg===
 224 @c load ("descriptive")$
 225 @c sam1: build_sample([[6,1], [j,2], [2,1]]);
 226 @c mean(sam1);
 227 @c barsplot(sam1) $
 228 @c ===end===
 229 @example
 230 (%i1) load ("descriptive")$
 231 @group
 232 (%i2) sam1: build_sample([[6,1], [j,2], [2,1]]);
 233                               [ 6 ]
 234                               [   ]
 235                               [ j ]
 236 (%o2)                         [   ]
 237                               [ j ]
 238                               [   ]
 239                               [ 2 ]
 240 @end group
 241 @group
 242 (%i3) mean(sam1);
 243                               j + 4
 244 (%o3)                        [-----]
 245                                 2
 246 @end group
 247 (%i4) barsplot(sam1) $
 248 @end example
 249
 250 Multivariate frequency table.
 251
 252 @c ===beg===
 253 @c load ("descriptive")$
 254 @c sam2: build_sample([[6,3,1], [5,6,2], [u,2,1],[6,8,2]]) ;
 255 @c cov(sam2);
 256 @c barsplot(sam2, grouping=stacked) $
 257 @c ===end===
 258 @example
 259 (%i1) load ("descriptive")$
 260 @group
 261 (%i2) sam2: build_sample([[6,3,1], [5,6,2], [u,2,1],[6,8,2]]) ;
 262                             [ 6  3 ]
 263                             [      ]
 264                             [ 5  6 ]
 265                             [      ]
 266                             [ 5  6 ]
 267 (%o2)                       [      ]
 268                             [ u  2 ]
 269                             [      ]
 270                             [ 6  8 ]
 271                             [      ]
 272                             [ 6  8 ]
 273 @end group
 274 @group
 275 (%i3) cov(sam2);
 276       [   2                 2                            ]
 277       [  u  + 158   (u + 28)     2 u + 174   11 (u + 28) ]
 278       [  -------- - ---------    --------- - ----------- ]
 279 (%o3) [     6          36            6           12      ]
 280       [                                                  ]
 281       [ 2 u + 174   11 (u + 28)            21            ]
 282       [ --------- - -----------            --            ]
 283       [     6           12                 4             ]
 284 @end group
 285 (%i4) barsplot(sam2, grouping=stacked) $
 286 @end example
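
The frequency table may also be given as a matrix. The following is a minimal sketch with a made-up table (the data are an assumption chosen for illustration): value 10 occurs twice and value 20 three times, so the resulting sample should have five rows.

@example
load ("descriptive")$
/* Hypothetical frequency table: the last column holds the frequencies. */
tab : matrix ([10, 2], [20, 3])$
sam3 : build_sample (tab);
/* Number of rows of the sample, expected to equal 2 + 3. */
length (sam3);
@end example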
 287
 288 @opencatbox{Categories:}
 289 @category{Package descriptive}
 290 @closecatbox
 291 @end deffn
 292
 293
 294
 295 @anchor{continuous_freq}
 296 @deffn {Function} continuous_freq @
 297 @fname{continuous_freq} (@var{data}) @
 298 @fname{continuous_freq} (@var{data}, @var{m})
 299
 300 Divides the range of @var{data} into intervals,
 301 and counts how many values fall into each one.
 302
 303 A value @var{x} falls into an interval with left and right endpoints @var{a} and @var{b}
 304 if and only if @code{@var{x} > @var{a}} and @code{@var{x} <= @var{b}},
 305 except for the first (least or leftmost) interval,
 306 for which @code{@var{x} >= @var{a}} and @code{@var{x} <= @var{b}}.
 307 That is, an interval excludes its left endpoint and includes its right endpoint,
 308 except for the first interval, which includes both the left and right endpoints.
 309
@var{data} must be a list of numbers,
or a 1-dimensional array (as created by @code{make_array}).
 312
 313 @var{m} is optional, and equals either the number of classes (10 by default),
 314 or a list of two elements (the least and greatest values to be counted),
 315 or a list of three elements (the least and greatest values to be counted, and the number of classes),
 316 or a set containing the endpoints of the class intervals.
 317
 318 It is assumed that class intervals are contiguous.
 319 That is, the right endpoint of one interval is equal to the left endpoint of the next.
 320
 321 @code{continuous_freq} returns a list of two lists.
 322 The first list comprises all the endpoints of the class intervals,
 323 concatenated into a single list.
 324 The second list contains the class counts for the intervals corresponding to elements of the first list.
 325
 326 If sample values are all equal, this function returns exactly
 327 one class of width 2.
 328
 329 Examples:
 330
The optional argument gives the number of classes.
The first list in the output contains the interval limits, and
the second the corresponding counts: there are 16 digits inside
the interval @code{[0, 1.8]}, 24 digits in @code{(1.8, 3.6]}, and so on.
 335
 336 @c ===beg===
 337 @c load ("descriptive")$
 338 @c s1 : read_list (file_search ("pidigits.data"))$
 339 @c continuous_freq (s1, 5);
 340 @c ===end===
 341 @example
 342 (%i1) load ("descriptive")$
 343 (%i2) s1 : read_list (file_search ("pidigits.data"))$
 344 @group
 345 (%i3) continuous_freq (s1, 5);
 346                9  18  27  36
 347 (%o3)     [[0, -, --, --, --, 9], [16, 24, 18, 17, 25]]
 348                5  5   5   5
 349 @end group
 350 @end example
 351
The optional argument requests 7 classes with limits
-2 and 12:
 354
 355 @c ===beg===
 356 @c load ("descriptive")$
 357 @c s1 : read_list (file_search ("pidigits.data"))$
 358 @c continuous_freq (s1, [-2,12,7]);
 359 @c ===end===
 360 @example
 361 (%i1) load ("descriptive")$
 362 (%i2) s1 : read_list (file_search ("pidigits.data"))$
 363 @group
 364 (%i3) continuous_freq (s1, [-2,12,7]);
 365 (%o3) [[- 2, 0, 2, 4, 6, 8, 10, 12], [8, 20, 22, 17, 20, 13, 0]]
 366 @end group
 367 @end example
 368
The optional argument requests the default number of classes with limits
-2 and 12:
 371
 372 @c ===beg===
 373 @c load ("descriptive")$
 374 @c s1 : read_list (file_search ("pidigits.data"))$
 375 @c continuous_freq (s1, [-2,12]);
 376 @c ===end===
 377 @example
 378 (%i1) load ("descriptive")$
 379 (%i2) s1 : read_list (file_search ("pidigits.data"))$
 380 @group
 381 (%i3) continuous_freq (s1, [-2,12]);
 382                3  4  11  18     32  39  46  53
 383 (%o3) [[- 2, - -, -, --, --, 5, --, --, --, --, 12],
 384                5  5  5   5      5   5   5   5
 385                               [0, 8, 20, 12, 18, 9, 8, 25, 0, 0]]
 386 @end group
 387 @end example
 388
 389 The first argument may be an array.
 390
 391 @c ===beg===
 392 @c load ("descriptive")$
 393 @c s1 : read_list (file_search ("pidigits.data"))$
 394 @c a1 : make_array (fixnum, length (s1)) $
 395 @c fillarray (a1, s1);
 396 @c continuous_freq (a1);
 397 @c ===end===
 398 @example
 399 (%i1) load ("descriptive")$
 400 (%i2) s1 : read_list (file_search ("pidigits.data"))$
 401 (%i3) a1 : make_array (fixnum, length (s1)) $
 402 @group
 403 (%i4) fillarray (a1, s1);
 404 (%o4) @{Lisp Array: #(3 1 4 1 5 9 2 6 5 3 5 8 9 7 9 3 2 3 8 4 6 2\
 405  6 4 3 3 8 3 2 7 9 5
 406                0 2 8 8 4 1 9 7 1 6 9 3 9 9 3 7 5 1 0 5 8 2 0 9 7\
 407  4 9 4 4 5 9 2
 408                3 0 7 8 1 6 4 0 6 2 8 6 2 0 8 9 9 8 6 2 8 0 3 4 8\
 409  2 5 3 4 2 1 1
 410                7 0 6 7)@}
 411 @end group
 412 @group
 413 (%i5) continuous_freq (a1);
 414            9   9  27  18  9  27  63  36  81
 415 (%o5) [[0, --, -, --, --, -, --, --, --, --, 9],
 416            10  5  10  5   2  5   10  5   10
 417                              [8, 8, 12, 12, 10, 8, 9, 8, 12, 13]]
 418 @end group
 419 @end example
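
The endpoint rule can be checked on a small sample whose values sit exactly on the class limits. In this sketch the data and the set of endpoints are made up for illustration. With endpoints 1, 2, 3 and 4 the classes are @code{[1, 2]}, @code{(2, 3]} and @code{(3, 4]}, so by the rule stated above the values 1 and 2 should both be counted in the first class, giving counts 2, 1 and 1.

@example
load ("descriptive")$
/* Class endpoints given as a set; every data value is an endpoint. */
continuous_freq ([1, 2, 3, 4], @{1, 2, 3, 4@});
@end example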
 420
 421 @opencatbox{Categories:}
 422 @category{Package descriptive}
 423 @closecatbox
 424 @end deffn
 425
 426
 427
 428 @anchor{discrete_freq}
 429 @deffn {Function} discrete_freq (@var{data})
Counts absolute frequencies in discrete samples, both numeric and categorical. Its only argument is a list
or a 1-dimensional array (as created by @code{make_array}).
 432
 433 @c ===beg===
 434 @c load ("descriptive")$
 435 @c s1 : read_list (file_search ("pidigits.data"))$
 436 @c discrete_freq (s1);
 437 @c ===end===
 438 @example
 439 (%i1) load ("descriptive")$
 440 (%i2) s1 : read_list (file_search ("pidigits.data"))$
 441 @group
 442 (%i3) discrete_freq (s1);
 443 (%o3) [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 444                              [8, 8, 12, 12, 10, 8, 9, 8, 12, 13]]
 445 @end group
 446 @end example
 447
The first list gives the sample values and the second their absolute frequencies.
 449
 450 The argument may be an array.
 451
 452 @c ===beg===
 453 @c load ("descriptive")$
 454 @c s1 : read_list (file_search ("pidigits.data"))$
 455 @c a1 : make_array (fixnum, length (s1)) $
 456 @c fillarray (a1, s1);
 457 @c discrete_freq (a1);
 458 @c ===end===
 459 @example
 460 (%i1) load ("descriptive")$
 461 (%i2) s1 : read_list (file_search ("pidigits.data"))$
 462 (%i3) a1 : make_array (fixnum, length (s1)) $
 463 @group
 464 (%i4) fillarray (a1, s1);
 465 (%o4) @{Lisp Array: #(3 1 4 1 5 9 2 6 5 3 5 8 9 7 9 3 2 3 8 4 6 2\
 466  6 4 3 3 8 3 2 7 9 5
 467                0 2 8 8 4 1 9 7 1 6 9 3 9 9 3 7 5 1 0 5 8 2 0 9 7\
 468  4 9 4 4 5 9 2
 469                3 0 7 8 1 6 4 0 6 2 8 6 2 0 8 9 9 8 6 2 8 0 3 4 8\
 470  2 5 3 4 2 1 1
 471                7 0 6 7)@}
 472 @end group
 473 @group
 474 (%i5) discrete_freq (a1);
 475 (%o5) [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 476                              [8, 8, 12, 12, 10, 8, 9, 8, 12, 13]]
 477 @end group
 478 @end example
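
The examples above use numeric data only. As a sketch of the categorical case, with a made-up list of labels, the call below should report @code{red} with frequency 3 and the other two labels with frequency 1 each. The order in which the labels are listed is not stated in this section, so none is assumed here.

@example
load ("descriptive")$
/* Hypothetical categorical sample with three distinct labels. */
discrete_freq ([red, blue, red, green, red]);
@end example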
 479
 480 @opencatbox{Categories:}
 481 @category{Package descriptive}
 482 @closecatbox
 483 @end deffn
 484
 485
 486
 487
 488 @anchor{standardize}
 489 @deffn {Function} standardize @
 490 @fname{standardize} (@var{list}) @
 491 @fname{standardize} (@var{matrix})
 492
Subtracts the sample mean from each element of the list and divides
the result by the standard deviation. When the input is a matrix,
@code{standardize} subtracts the multivariate mean from each row, and then
divides each component by the corresponding standard deviation.
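
A minimal sketch with made-up data follows. Whichever standard deviation is used as divisor, the standardized sample should have mean zero. This section does not say whether the divisor is @code{std} or @code{std1}, so the last two lines simply print both for comparison.

@example
load ("descriptive")$
/* Hypothetical sample with mean 5. */
s : [2, 4, 4, 4, 5, 5, 7, 9]$
z : standardize (s);
/* Centering makes the mean of the standardized sample zero. */
mean (z);
/* One of these should be 1, depending on the deviation used. */
std (z);
std1 (z);
@end example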
 497
 498 @opencatbox{Categories:}
 499 @category{Package descriptive}
 500 @closecatbox
 501 @end deffn
 502
 503
 504
 505
 506 @anchor{subsample}
 507 @deffn {Function} subsample @
 508 @fname{subsample} (@var{data_matrix}, @var{predicate_function}) @
 509 @fname{subsample} (@var{data_matrix}, @var{predicate_function}, @var{col_num1}, @var{col_num2}, ...)
 510
This is a variant of the Maxima @code{submatrix} function.
The first argument is the data matrix, the second is a predicate function,
and the optional additional arguments are the numbers of the columns to be taken.
Its behaviour is best understood through examples.

These are the multivariate records in which the wind speed
at the first meteorological station was greater than 18.
Note that in the lambda expression the @var{i}-th component is
referred to as @code{v[i]}.
 520 @c ===beg===
 521 @c load ("descriptive")$
 522 @c s2 : read_matrix (file_search ("wind.data"))$
 523 @c subsample (s2, lambda([v], v[1] > 18));
 524 @c ===end===
 525 @example
 526 (%i1) load ("descriptive")$
 527 (%i2) s2 : read_matrix (file_search ("wind.data"))$
 528 @group
 529 (%i3) subsample (s2, lambda([v], v[1] > 18));
 530               [ 19.38  15.37  15.12  23.09  25.25 ]
 531               [                                   ]
 532               [ 18.29  18.66  19.08  26.08  27.63 ]
 533 (%o3)         [                                   ]
 534               [ 20.25  21.46  19.95  27.71  23.38 ]
 535               [                                   ]
 536               [ 18.79  18.96  14.46  26.38  21.84 ]
 537 @end group
 538 @end example
 539
In the following example, we request only the first, second and fifth
components of those records with wind speeds greater than or equal to 16 knots
at station number 1 and less than 25 knots at station number 4. The sample
contains only data from stations 1, 2 and 5. In this case,
the predicate function is defined as an ordinary Maxima function.
 545 @c ===beg===
 546 @c load ("descriptive")$
 547 @c s2 : read_matrix (file_search ("wind.data"))$
 548 @c g(x):= x[1] >= 16 and x[4] < 25$
 549 @c subsample (s2, g, 1, 2, 5);
 550 @c ===end===
 551 @example
 552 (%i1) load ("descriptive")$
 553 (%i2) s2 : read_matrix (file_search ("wind.data"))$
 554 (%i3) g(x):= x[1] >= 16 and x[4] < 25$
 555 @group
 556 (%i4) subsample (s2, g, 1, 2, 5);
 557                      [ 19.38  15.37  25.25 ]
 558                      [                     ]
 559                      [ 17.33  14.67  19.58 ]
 560 (%o4)                [                     ]
 561                      [ 16.92  13.21  21.21 ]
 562                      [                     ]
 563                      [ 17.25  18.46  23.87 ]
 564 @end group
 565 @end example
 566
 567 Here is an example with the categorical variables of @code{biomed.data}.
 568 We want the records corresponding to those patients in group @code{B}
 569 who are older than 38 years.
 570 @c ===beg===
 571 @c load ("descriptive")$
 572 @c s3 : read_matrix (file_search ("biomed.data"))$
 573 @c h(u):= u[1] = B and u[2] > 38 $
 574 @c subsample (s3, h);
 575 @c ===end===
 576 @example
 577 (%i1) load ("descriptive")$
 578 (%i2) s3 : read_matrix (file_search ("biomed.data"))$
 579 (%i3) h(u):= u[1] = B and u[2] > 38 $
 580 @group
 581 (%i4) subsample (s3, h);
 582                 [ B  39  28.0  102.3  17.1  146 ]
 583                 [                               ]
 584                 [ B  39  21.0  92.4   10.3  197 ]
 585                 [                               ]
 586                 [ B  39  23.0  111.5  10.0  133 ]
 587                 [                               ]
 588                 [ B  39  26.0  92.6   12.3  196 ]
 589 (%o4)           [                               ]
 590                 [ B  39  25.0  98.7   10.0  174 ]
 591                 [                               ]
 592                 [ B  39  21.0  93.2   5.9   181 ]
 593                 [                               ]
 594                 [ B  39  18.0  95.0   11.3  66  ]
 595                 [                               ]
 596                 [ B  39  39.0  88.5   7.6   168 ]
 597 @end group
 598 @end example
 599
The statistical analysis will probably involve only the blood measures:
 601 @c ===beg===
 602 @c load ("descriptive")$
 603 @c s3 : read_matrix (file_search ("biomed.data"))$
 604 @c subsample (s3, lambda([v], v[1] = B and v[2] > 38),
 605 @c            3, 4, 5, 6);
 606 @c ===end===
 607 @example
 608 (%i1) load ("descriptive")$
 609 (%i2) s3 : read_matrix (file_search ("biomed.data"))$
 610 @group
 611 (%i3) subsample (s3, lambda([v], v[1] = B and v[2] > 38),
 612            3, 4, 5, 6);
 613                    [ 28.0  102.3  17.1  146 ]
 614                    [                        ]
 615                    [ 21.0  92.4   10.3  197 ]
 616                    [                        ]
 617                    [ 23.0  111.5  10.0  133 ]
 618                    [                        ]
 619                    [ 26.0  92.6   12.3  196 ]
 620 (%o3)              [                        ]
 621                    [ 25.0  98.7   10.0  174 ]
 622                    [                        ]
 623                    [ 21.0  93.2   5.9   181 ]
 624                    [                        ]
 625                    [ 18.0  95.0   11.3  66  ]
 626                    [                        ]
 627                    [ 39.0  88.5   7.6   168 ]
 628 @end group
 629 @end example
 630
 631 This is the multivariate mean of @code{s3},
 632 @c ===beg===
 633 @c load ("descriptive")$
 634 @c s3 : read_matrix (file_search ("biomed.data"))$
 635 @c mean (s3);
 636 @c ===end===
 637 @example
 638 (%i1) load ("descriptive")$
 639 (%i2) s3 : read_matrix (file_search ("biomed.data"))$
 640 @group
 641 (%i3) mean (s3);
 642        13 B + 7 A  317
 643 (%o3) [----------, ---, 87.178, 0.06 NA + 81.44999999999999,
 644            20      10
 645                                                     3 NA + 19587
 646                                 18.122999999999998, ------------]
 647                                                         100
 648 @end group
 649 @end example
 650
Here, the first component is meaningless, since @code{A} and @code{B} are categorical; the second component is the mean age of the individuals, in rational form; and the fourth and last values look strange. This is because the symbol @code{NA} is used here to indicate @emph{not available} data, so these two means are nonsense. A possible solution is to remove from the matrix those rows containing @code{NA} symbols, although this entails some loss of information.
 652 @c ===beg===
 653 @c load ("descriptive")$
 654 @c s3 : read_matrix (file_search ("biomed.data"))$
 655 @c g(v):= v[4] # NA and v[6] # NA $
 656 @c mean (subsample (s3, g, 3, 4, 5, 6));
 657 @c ===end===
 658 @example
 659 (%i1) load ("descriptive")$
 660 (%i2) s3 : read_matrix (file_search ("biomed.data"))$
 661 (%i3) g(v):= v[4] # NA and v[6] # NA $
 662 @group
 663 (%i4) mean (subsample (s3, g, 3, 4, 5, 6));
 664 (%o4) [79.4923076923077, 86.2032967032967, 16.93186813186813,
 665                                                             2514
 666                                                             ----]
 667                                                              13
 668 @end group
 669 @end example
 670
 671 @opencatbox{Categories:}
 672 @category{Package descriptive}
 673 @closecatbox
 674 @end deffn
 675
 676
 677
 678
 679
 680 @anchor{transform_sample}
 681 @deffn {Function} transform_sample (@var{matrix}, @var{varlist}, @var{exprlist})
 682
Transforms the sample @var{matrix}, where each column is named according to
@var{varlist}, following the expressions in @var{exprlist}.
 685
 686 Examples:
 687
The second argument assigns names to the three columns. With these names,
a list of expressions defines the transformation of the sample.
 690
 691 @example
 692 (%i1) load ("descriptive")$
 693 (%i2) data: matrix([3,2,7],[3,7,2],[8,2,4],[5,2,4]) $
 694 @group
 695 (%i3) transform_sample(data, [a,b,c], [c, a*b, log(a)]);
 696                                [ 7  6   log(3) ]
 697                                [               ]
 698                                [ 2  21  log(3) ]
 699 (%o3)                          [               ]
 700                                [ 4  16  log(8) ]
 701                                [               ]
 702                                [ 4  10  log(5) ]
 703 @end group
 704 @end example
 705
 706 Add a constant column and remove the third variable.
 707
 708 @example
 709 (%i1) load ("descriptive")$
 710 (%i2) data: matrix([3,2,7],[3,7,2],[8,2,4],[5,2,4]) $
 711 (%i3) transform_sample(data, [a,b,c], [makelist(1,k,length(data)),a,b]);
 712 @group
 713                                   [ 1  3  2 ]
 714                                   [         ]
 715                                   [ 1  3  7 ]
 716 (%o3)                             [         ]
 717                                   [ 1  8  2 ]
 718                                   [         ]
 719                                   [ 1  5  2 ]
 720 @end group
 721 @end example
 722
 723 @opencatbox{Categories:}
 724 @category{Package descriptive}
 725 @closecatbox
 726 @end deffn
 727
 728
 729
 730
 731
 732
 733
 734 @node Functions and Variables for descriptive statistics, Functions and Variables for statistical graphs, Functions and Variables for data manipulation, descriptive-pkg
 735 @section Functions and Variables for descriptive statistics
 736
 737
 738
 739 @anchor{mean}
 740 @deffn {Function} mean @
 741 @fname{mean} (@var{list}) @
 742 @fname{mean} (@var{matrix})
 743
 744 This is the sample mean, defined as
 745 @ifnottex
 746 @example
 747                        n
 748                      ====
 749              _   1   \
 750              x = -    >    x
 751                  n   /      i
 752                      ====
 753                      i = 1
 754 @end example
 755 @end ifnottex
 756 @tex
 757 $${\bar{x}={1\over{n}}{\sum_{i=1}^{n}{x_{i}}}}$$
 758 @end tex
 759
 760 Example:
 761
 762 @c ===beg===
 763 @c load ("descriptive")$
 764 @c s1 : read_list (file_search ("pidigits.data"))$
 765 @c mean (s1);
 766 @c %, numer;
 767 @c s2 : read_matrix (file_search ("wind.data"))$
 768 @c mean (s2);
 769 @c ===end===
 770 @example
 771 (%i1) load ("descriptive")$
 772 (%i2) s1 : read_list (file_search ("pidigits.data"))$
 773 @group
 774 (%i3) mean (s1);
 775                                471
 776 (%o3)                          ---
 777                                100
 778 @end group
 779 @group
 780 (%i4) %, numer;
 781 (%o4)                         4.71
 782 @end group
 783 (%i5) s2 : read_matrix (file_search ("wind.data"))$
 784 @group
 785 (%i6) mean (s2);
 786 (%o6) [9.9485, 10.160700000000004, 10.868499999999997,
 787                           15.716600000000001, 14.844100000000001]
 788 @end group
 789 @end example
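
The definition can be checked directly on a short list. This sketch uses made-up data and the core function @code{lsum} to write the sum explicitly; both expressions should return 14/5.

@example
load ("descriptive")$
s : [3, 1, 4, 1, 5]$
/* Sum of the elements divided by the sample size. */
lsum (x, x, s) / length (s);
/* Same value from the package function. */
mean (s);
@end example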
 790
 791 @opencatbox{Categories:}
 792 @category{Package descriptive}
 793 @closecatbox
 794 @end deffn
 795
 796
 797
 798 @anchor{var}
 799 @deffn {Function} var @
 800 @fname{var} (@var{list}) @
 801 @fname{var} (@var{matrix})
 802
This is the sample variance with denominator @math{n}, defined as
 804 @ifnottex
 805 @example
 806 @group
 807                      n
 808                    ====
 809            2   1   \          _ 2
 810           s  = -    >    (x - x)
 811                n   /       i
 812                    ====
 813                    i = 1
 814 @end group
 815 @end example
 816 @end ifnottex
 817 @tex
 818 $${{1}\over{n}}{\sum_{i=1}^{n}{(x_{i}-\bar{x})^2}}$$
 819 @end tex
 820
 821 Example:
 822
 823 @c ===beg===
 824 @c load ("descriptive")$
 825 @c s1 : read_list (file_search ("pidigits.data"))$
 826 @c var (s1), numer;
 827 @c ===end===
 828 @example
 829 (%i1) load ("descriptive")$
 830 (%i2) s1 : read_list (file_search ("pidigits.data"))$
 831 @group
 832 (%i3) var (s1), numer;
 833 (%o3)                   8.425899999999999
 834 @end group
 835 @end example
 836
 837 See also function @mrefdot{var1}
 838
 839 @opencatbox{Categories:}
 840 @category{Package descriptive}
 841 @closecatbox
 842 @end deffn
 843
 844
 845
 846 @anchor{var1}
 847 @deffn {Function} var1 @
 848 @fname{var1} (@var{list}) @
 849 @fname{var1} (@var{matrix})
 850
This is the sample variance with denominator @math{n-1}, defined as
 852 @ifnottex
 853 @example
 854 @group
 855                      n
 856                    ====
 857                1   \          _ 2
 858               ---   >    (x - x)
 859               n-1  /       i
 860                    ====
 861                    i = 1
 862 @end group
 863 @end example
 864 @end ifnottex
 865 @tex
 866 $${{1\over{n-1}}{\sum_{i=1}^{n}{(x_{i}-\bar{x})^2}}}$$
 867 @end tex
 868
 869 Example:
 870
 871 @c ===beg===
 872 @c load ("descriptive")$
 873 @c s1 : read_list (file_search ("pidigits.data"))$
 874 @c var1 (s1), numer;
 875 @c s2 : read_matrix (file_search ("wind.data"))$
 876 @c var1 (s2);
 877 @c ===end===
 878 @example
 879 (%i1) load ("descriptive")$
 880 (%i2) s1 : read_list (file_search ("pidigits.data"))$
 881 @group
 882 (%i3) var1 (s1), numer;
 883 (%o3)                    8.5110101010101
 884 @end group
 885 (%i4) s2 : read_matrix (file_search ("wind.data"))$
 886 @group
 887 (%i5) var1 (s2);
 888 (%o5) [17.395865404040414, 15.139127787878794,
 889        15.632049242424243, 32.50152569696971, 24.669773929292937]
 890 @end group
 891 @end example
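
The two variances differ only in the denominator, so @code{var1} equals @code{var} multiplied by @math{n/(n-1)}. A minimal sketch with made-up data, whose squared deviations from the mean add up to 32:

@example
load ("descriptive")$
s : [2, 4, 4, 4, 5, 5, 7, 9]$
n : length (s)$
/* Denominator n: 32/8. */
var (s);
/* Denominator n-1: 32/7. */
var1 (s);
/* Rescaling var by n/(n-1) reproduces var1. */
is (var1 (s) = n / (n - 1) * var (s));
@end example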
 892
 893 See also function @mrefdot{var}
 894
 895 @opencatbox{Categories:}
 896 @category{Package descriptive}
 897 @closecatbox
 898 @end deffn
 899
 900
 901
 902 @anchor{std}
 903 @deffn {Function} std @
 904 @fname{std} (@var{list}) @
 905 @fname{std} (@var{matrix})
 906
This is the sample standard deviation: the square root of @code{var}, the variance with denominator @math{n}.
 908
 909 Example:
 910
 911 @c ===beg===
 912 @c load ("descriptive")$
 913 @c s1 : read_list (file_search ("pidigits.data"))$
 914 @c std (s1), numer;
 915 @c s2 : read_matrix (file_search ("wind.data"))$
 916 @c std (s2);
 917 @c ===end===
 918 @example
 919 (%i1) load ("descriptive")$
 920 (%i2) s1 : read_list (file_search ("pidigits.data"))$
 921 @group
 922 (%i3) std (s1), numer;
 923 (%o3)                  2.9027400848164135
 924 @end group
 925 (%i4) s2 : read_matrix (file_search ("wind.data"))$
 926 @group
 927 (%i5) std (s2);
 928 (%o5) [4.149928523480858, 3.8713998127292415,
 929         3.9339202775348663, 5.672434260526957, 4.941970881136392]
 930 @end group
 931 @end example
 932
 933 See also functions @mref{var} and @mrefdot{std1}
 934
 935 @opencatbox{Categories:}
 936 @category{Package descriptive}
 937 @closecatbox
 938 @end deffn
 939
 940
 941
 942 @anchor{std1}
 943 @deffn {Function} std1 @
 944 @fname{std1} (@var{list}) @
 945 @fname{std1} (@var{matrix})
 946
 947 This is the square root of the function @mrefcomma{var1} the variance with denominator @math{n-1}.
 948
 949 Example:
 950
 951 @c ===beg===
 952 @c load ("descriptive")$
 953 @c s1 : read_list (file_search ("pidigits.data"))$
 954 @c std1 (s1), numer;
 955 @c s2 : read_matrix (file_search ("wind.data"))$
 956 @c std1 (s2);
 957 @c ===end===
 958 @example
 959 (%i1) load ("descriptive")$
 960 (%i2) s1 : read_list (file_search ("pidigits.data"))$
 961 @group
 962 (%i3) std1 (s1), numer;
 963 (%o3)                   2.917363553109228
 964 @end group
 965 (%i4) s2 : read_matrix (file_search ("wind.data"))$
 966 @group
 967 (%i5) std1 (s2);
 968 (%o5) [4.170835096721089, 3.8909032097803196,
 969         3.9537386411375555, 5.701010936401517, 4.966867617451963]
 970 @end group
 971 @end example
 972
 973 See also functions @mref{var1} and @mrefdot{std}
 974
 975 @opencatbox{Categories:}
 976 @category{Package descriptive}
 977 @closecatbox
 978 @end deffn
 979
 980
 981
 982 @anchor{noncentral_moment}
 983 @deffn {Function} noncentral_moment @
 984 @fname{noncentral_moment} (@var{list}, @var{k}) @
 985 @fname{noncentral_moment} (@var{matrix}, @var{k})
 986
 987 The noncentral moment of order @math{k}, defined as
 988 @ifnottex
 989 @example
 990 @group
 991                        n
 992                      ====
 993                  1   \      k
 994                  -    >    x
 995                  n   /      i
 996                      ====
 997                      i = 1
 998 @end group
 999 @end example
1000 @end ifnottex
1001 @tex
1002 $${{1\over{n}}{\sum_{i=1}^{n}{x_{i}^k}}}$$
1003 @end tex
1004
1005 Example:
1006
1007 Input is a list.
1008 The first noncentral moment is equal to the sample mean.
1009 @c ===beg===
1010 @c load ("descriptive")$
1011 @c s1 : read_list (file_search ("pidigits.data"))$
1012 @c noncentral_moment (s1, 1), numer;
1013 @c mean (s1), numer;
1014 @c ===end===
1015 @example
1016 (%i1) load ("descriptive")$
1017 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1018 @group
1019 (%i3) noncentral_moment (s1, 1), numer;
1020 (%o3)                         4.71
1021 @end group
1022 @group
1023 (%i4) mean (s1), numer;
1024 (%o4)                         4.71
1025 @end group
1026 @end example
1027
1028 Input is a matrix.
1029 Calculation of the fifth noncentral moment for each column.
1030 @c ===beg===
1031 @c load ("descriptive")$
1032 @c s2 : read_matrix (file_search ("wind.data"))$
1033 @c noncentral_moment (s2, 5);
1034 @c ===end===
1035 @example
1036 (%i1) load ("descriptive")$
1037 (%i2) s2 : read_matrix (file_search ("wind.data"))$
1038 @group
1039 (%i3) noncentral_moment (s2, 5);
1040 (%o3) [319793.87247615046, 320532.19238924625,
1041        391249.56213815557, 2502278.205988911, 1691881.7977422548]
1042 @end group
1043 @end example
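
As a hedged illustration, the definition can be reproduced by hand with the built-in @code{lsum}. The sample @code{x} below is made up for this sketch.

@example
load ("descriptive")$
x : [1, 2, 3, 4]$                     /* illustrative data */
noncentral_moment (x, 2);             /* (1 + 4 + 9 + 16)/4 = 15/2 */
lsum (xi^2, xi, x) / length (x);      /* same value, straight from the definition */
@end example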
1044
1045 See also function @mrefdot{central_moment}
1046
1047 @opencatbox{Categories:}
1048 @category{Package descriptive}
1049 @closecatbox
1050 @end deffn
1051
1052
1053
1054 @anchor{central_moment}
1055 @deffn {Function} central_moment @
1056 @fname{central_moment} (@var{list}, @var{k}) @
1057 @fname{central_moment} (@var{matrix}, @var{k})
1058
1059 The central moment of order @math{k}, defined as
1060 @ifnottex
1061 @example
1062 @group
1063                     n
1064                   ====
1065               1   \          _ k
1066               -    >    (x - x)
1067               n   /       i
1068                   ====
1069                   i = 1
1070 @end group
1071 @end example
1072 @end ifnottex
1073 @tex
1074 $${{1\over{n}}{\sum_{i=1}^{n}{(x_{i}-\bar{x})^k}}}$$
1075 @end tex
1076
1077 Example:
1078
1079 Input is a list.
1080 The second central moment is equal to the sample variance.
1081 @c ===beg===
1082 @c load ("descriptive")$
1083 @c s1 : read_list (file_search ("pidigits.data"))$
1084 @c central_moment (s1, 2), numer;
1085 @c var (s1), numer;
1086 @c ===end===
1087 @example
1088 (%i1) load ("descriptive")$
1089 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1090 @group
1091 (%i3) central_moment (s1, 2), numer;
1092 (%o3)                   8.425899999999999
1093 @end group
1094 @group
1095 (%i4) var (s1), numer;
1096 (%o4)                   8.425899999999999
1097 @end group
1098 @end example
1099
1100 Input is a matrix.
1101 Calculation of the third central moment.
1102 @c ===beg===
1103 @c load ("descriptive")$
1104 @c s2 : read_matrix (file_search ("wind.data"))$
1105 @c central_moment (s2, 3);
1106 @c ===end===
1107 @example
1108 (%i1) load ("descriptive")$
1109 (%i2) s2 : read_matrix (file_search ("wind.data"))$
1110 @group
1116 (%i3) central_moment (s2, 3);
1117 (%o3) [11.29584771375004, 16.97988248298583, 5.626661952750102,
1118                              37.5986572057918, 25.85981904394192]
1119 @end group
1120 @end example
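
Central and noncentral moments are linked by expanding the power in the definition; for order 2 this gives the second noncentral moment minus the square of the first. A small sketch on made-up data:

@example
load ("descriptive")$
x : [1, 2, 3, 4]$                                        /* illustrative data */
central_moment (x, 2);                                   /* 5/4 */
noncentral_moment (x, 2) - noncentral_moment (x, 1)^2;   /* 15/2 - 25/4 = 5/4 */
@end example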
1121
1122 See also functions @mref{noncentral_moment} and @mrefdot{mean}
1123
1124 @opencatbox{Categories:}
1125 @category{Package descriptive}
1126 @closecatbox
1127 @end deffn
1128
1129
1130
1131 @anchor{cv}
1132 @deffn {Function} cv @
1133 @fname{cv} (@var{list}) @
1134 @fname{cv} (@var{matrix})
1135
1136 The coefficient of variation is the quotient of the sample standard deviation (@mref{std}) and the sample @mrefdot{mean}
1137
1138 @c ===beg===
1139 @c load ("descriptive")$
1140 @c s1 : read_list (file_search ("pidigits.data"))$
1141 @c cv (s1), numer;
1142 @c s2 : read_matrix (file_search ("wind.data"))$
1143 @c cv (s2);
1144 @c ===end===
1145 @example
1146 (%i1) load ("descriptive")$
1147 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1148 @group
1149 (%i3) cv (s1), numer;
1150 (%o3)                  0.6162930116383044
1151 @end group
1152 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1153 @group
1154 (%i5) cv (s2);
1155 (%o5) [0.4171411291632767, 0.38101703748061055,
1156       0.3619561372346568, 0.3609199356430116, 0.3329249251309538]
1157 @end group
1158 @end example
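
The quotient can be checked by hand on a small sample. The data below are made up for illustration and chosen so that the standard deviation is an integer.

@example
load ("descriptive")$
x : [2, 4, 4, 4, 5, 5, 7, 9]$      /* mean 5, variance 32/8 = 4 */
std (x);                           /* 2 */
cv (x);                            /* 2/5 */
is (cv (x) = std (x) / mean (x));  /* true */
@end example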
1159
1160 See also functions @mref{std} and @mrefdot{mean}
1161
1162 @opencatbox{Categories:}
1163 @category{Package descriptive}
1164 @closecatbox
1165 @end deffn
1166
1167
1168
1169 @anchor{smin}
1170 @deffn {Function} smin @
1171 @fname{smin} (@var{list}) @
1172 @fname{smin} (@var{matrix})
1173
1174 This is the minimum value of the sample @var{list}.
1175 When the argument is a matrix, @code{smin} returns
1176 a list containing the minimum values of the columns,
1177 which are associated with statistical variables.
1178
1179 @c ===beg===
1180 @c load ("descriptive")$
1181 @c s1 : read_list (file_search ("pidigits.data"))$
1182 @c smin (s1);
1183 @c s2 : read_matrix (file_search ("wind.data"))$
1184 @c smin (s2);
1185 @c ===end===
1186 @example
1187 (%i1) load ("descriptive")$
1188 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1189 @group
1190 (%i3) smin (s1);
1191 (%o3)                           0
1192 @end group
1193 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1194 @group
1195 (%i5) smin (s2);
1196 (%o5)             [0.58, 0.5, 2.67, 5.25, 5.17]
1197 @end group
1198 @end example
1199
1200 See also function @mrefdot{smax}
1201
1202 @opencatbox{Categories:}
1203 @category{Package descriptive}
1204 @closecatbox
1205 @end deffn
1206
1207
1208
1209 @anchor{smax}
1210 @deffn {Function} smax @
1211 @fname{smax} (@var{list}) @
1212 @fname{smax} (@var{matrix})
1213
1214 This is the maximum value of the sample @var{list}.
1215 When the argument is a matrix, @code{smax} returns
1216 a list containing the maximum values of the columns,
1217 which are associated with statistical variables.
1218
1219 @c ===beg===
1220 @c load ("descriptive")$
1221 @c s1 : read_list (file_search ("pidigits.data"))$
1222 @c smax (s1);
1223 @c s2 : read_matrix (file_search ("wind.data"))$
1224 @c smax (s2);
1225 @c ===end===
1226 @example
1227 (%i1) load ("descriptive")$
1228 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1229 @group
1230 (%i3) smax (s1);
1231 (%o3)                           9
1232 @end group
1233 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1234 @group
1235 (%i5) smax (s2);
1236 (%o5)          [20.25, 21.46, 20.04, 29.63, 27.63]
1237 @end group
1238 @end example
1239
1240 See also function @mrefdot{smin}
1241
1242 @opencatbox{Categories:}
1243 @category{Package descriptive}
1244 @closecatbox
1245 @end deffn
1246
1247
1248
1249 @anchor{range}
1250 @deffn {Function} range @
1251 @fname{range} (@var{list}) @
1252 @fname{range} (@var{matrix})
1253
1254 The range is the difference between the extreme values.
1255
1256 Example:
1257
1258 @c ===beg===
1259 @c load ("descriptive")$
1260 @c s1 : read_list (file_search ("pidigits.data"))$
1261 @c range (s1);
1262 @c s2 : read_matrix (file_search ("wind.data"))$
1263 @c range (s2);
1264 @c ===end===
1265 @example
1266 (%i1) load ("descriptive")$
1267 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1268 @group
1269 (%i3) range (s1);
1270 (%o3)                           9
1271 @end group
1272 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1273 @group
1274 (%i5) range (s2);
1275 (%o5)   [19.67, 20.96, 17.369999999999997, 24.38, 22.46]
1276 @end group
1277 @end example
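
In terms of the functions documented above, the range is @code{smax} minus @code{smin}. A short sketch on made-up data:

@example
load ("descriptive")$
x : [3, 1, 4, 1, 5, 9, 2, 6]$   /* illustrative data */
range (x);                      /* 9 - 1 = 8 */
smax (x) - smin (x);            /* 8 */
@end example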
1278
1279 @opencatbox{Categories:}
1280 @category{Package descriptive}
1281 @closecatbox
1282 @end deffn
1283
1284
1285
1286 @anchor{quantile}
1287 @deffn {Function} quantile @
1288 @fname{quantile} (@var{list}, @var{p}) @
1289 @fname{quantile} (@var{matrix}, @var{p})
1290
1291 This is the @var{p}-quantile, with @var{p} a number in @math{[0, 1]}, of the sample @var{list}.
1292 Although there are several definitions of the sample quantile (Hyndman, R. J., Fan, Y. (1996) @cite{Sample quantiles in statistical packages}. American Statistician, 50, 361-365), the one based on linear interpolation is implemented in package @ref{descriptive-pkg}.
1293
1294 Example:
1295
1296 Input is a list. First and third quartiles are computed.
1297
1298 @c ===beg===
1299 @c load ("descriptive")$
1300 @c s1 : read_list (file_search ("pidigits.data"))$
1301 @c [quantile (s1, 1/4), quantile (s1, 3/4)], numer;
1302 @c ===end===
1303 @example
1304 (%i1) load ("descriptive")$
1305 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1306 @group
1307 (%i3) [quantile (s1, 1/4), quantile (s1, 3/4)], numer;
1308 (%o3)                      [2.0, 7.25]
1309 @end group
1310 @end example
1311
1312 Input is a matrix. First quartile is computed for each column.
1313
1314 @c ===beg===
1315 @c load ("descriptive")$
1316 @c s2 : read_matrix (file_search ("wind.data"))$
1317 @c quantile (s2, 1/4);
1318 @c ===end===
1319 @example
1320 (%i1) load ("descriptive")$
1321 (%i2) s2 : read_matrix (file_search ("wind.data"))$
1322 @group
1323 (%i3) quantile (s2, 1/4);
1324 (%o3)    [7.2575, 7.477500000000001, 7.82, 11.28, 11.48]
1325 @end group
1326 @end example
1327
1328 @opencatbox{Categories:}
1329 @category{Package descriptive}
1330 @closecatbox
1331 @end deffn
1332
1333
1334
1335 @anchor{median}
1336 @deffn {Function} median @
1337 @fname{median} (@var{list}) @
1338 @fname{median} (@var{matrix})
1339
1340 Once the sample is ordered, if the sample size is odd the median is the central value, otherwise it is the mean of the two central values.
1341
1342 Example:
1343
1344 @c ===beg===
1345 @c load ("descriptive")$
1346 @c s1 : read_list (file_search ("pidigits.data"))$
1347 @c median (s1);
1348 @c s2 : read_matrix (file_search ("wind.data"))$
1349 @c median (s2);
1350 @c ===end===
1351 @example
1352 (%i1) load ("descriptive")$
1353 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1354 @group
1355 (%i3) median (s1);
1356                                 9
1357 (%o3)                           -
1358                                 2
1359 @end group
1360 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1361 @group
1362 (%i5) median (s2);
1363 (%o5)   [10.059999999999999, 9.855, 10.73, 15.48, 14.105]
1364 @end group
1365 @end example
1366
1367 The median is the 1/2-quantile.
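
A minimal sketch of both cases of the definition, on samples made up for illustration, together with a check against the 1/2-quantile:

@example
load ("descriptive")$
median ([7, 1, 5]);       /* odd size: central value of 1, 5, 7, that is 5 */
median ([7, 1, 5, 3]);    /* even size: mean of 3 and 5, that is 4 */
is (equal (median ([7, 1, 5, 3]), quantile ([7, 1, 5, 3], 1/2)));   /* true */
@end example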
1368
1369 See also function @mrefdot{quantile}
1370
1371 @opencatbox{Categories:}
1372 @category{Package descriptive}
1373 @closecatbox
1374 @end deffn
1375
1376
1377
1378 @anchor{qrange}
1379 @deffn {Function} qrange @
1380 @fname{qrange} (@var{list}) @
1381 @fname{qrange} (@var{matrix})
1382
1383 The interquartile range is the difference between the third and first quartiles, @code{quantile(@var{list},3/4) - quantile(@var{list},1/4)}.
1384
1385 @c ===beg===
1386 @c load ("descriptive")$
1387 @c s1 : read_list (file_search ("pidigits.data"))$
1388 @c qrange (s1);
1389 @c s2 : read_matrix (file_search ("wind.data"))$
1390 @c qrange (s2);
1391 @c ===end===
1392 @example
1393 (%i1) load ("descriptive")$
1394 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1395 @group
1396 (%i3) qrange (s1);
1397                                21
1398 (%o3)                          --
1399                                4
1400 @end group
1401 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1402 @group
1403 (%i5) qrange (s2);
1404 (%o5) [5.385, 5.572499999999998, 6.022500000000001,
1405                             8.729999999999999, 6.649999999999999]
1406 @end group
1407 @end example
1408
1409 See also function @mrefdot{quantile}
1410
1411 @opencatbox{Categories:}
1412 @category{Package descriptive}
1413 @closecatbox
1414 @end deffn
1415
1416
1417
1418 @anchor{mean_deviation}
1419 @deffn {Function} mean_deviation @
1420 @fname{mean_deviation} (@var{list}) @
1421 @fname{mean_deviation} (@var{matrix})
1422
1423 The mean deviation, defined as
1424 @ifnottex
1425 @example
1426 @group
1427                      n
1428                    ====
1429                1   \          _
1430                -    >    |x - x|
1431                n   /       i
1432                    ====
1433                    i = 1
1434 @end group
1435 @end example
1436 @end ifnottex
1437 @tex
1438 $${{1\over{n}}{\sum_{i=1}^{n}{|x_{i}-\bar{x}|}}}$$
1439 @end tex
1440
1441 Example:
1442
1443 @c ===beg===
1444 @c load ("descriptive")$
1445 @c s1 : read_list (file_search ("pidigits.data"))$
1446 @c mean_deviation (s1);
1447 @c s2 : read_matrix (file_search ("wind.data"))$
1448 @c mean_deviation (s2);
1449 @c ===end===
1450 @example
1451 (%i1) load ("descriptive")$
1452 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1453 @group
1454 (%i3) mean_deviation (s1);
1455                                51
1456 (%o3)                          --
1457                                20
1458 @end group
1459 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1460 @group
1461 (%i5) mean_deviation (s2);
1462 (%o5) [3.2879599999999987, 3.075342, 3.2390700000000003,
1463                             4.715664000000001, 4.028546000000002]
1464 @end group
1465 @end example
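
The definition can be reproduced by hand with @code{lsum} and @code{abs}. The sample below is made up for this sketch.

@example
load ("descriptive")$
x : [1, 2, 3, 4, 10]$                             /* illustrative data, mean 4 */
mean_deviation (x);                               /* (3 + 2 + 1 + 0 + 6)/5 = 12/5 */
lsum (abs (xi - mean (x)), xi, x) / length (x);   /* 12/5 */
@end example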
1466
1467 See also function @mrefdot{mean}
1468
1469 @opencatbox{Categories:}
1470 @category{Package descriptive}
1471 @closecatbox
1472 @end deffn
1473
1474
1475
1476 @anchor{median_deviation}
1477 @deffn {Function} median_deviation @
1478 @fname{median_deviation} (@var{list}) @
1479 @fname{median_deviation} (@var{matrix})
1480
1481 The median deviation, defined as
1482 @ifnottex
1483 @example
1484 @group
1485                  n
1486                ====
1487            1   \
1488            -    >    |x - med|
1489            n   /       i
1490                ====
1491                i = 1
1492 @end group
1493 @end example
1494 @end ifnottex
1495 @tex
1496 $${{1\over{n}}{\sum_{i=1}^{n}{|x_{i}-med|}}}$$
1497 @end tex
1498 where @code{med} is the median of @var{list}.
1499
1500 Example:
1501
1502 @c ===beg===
1503 @c load ("descriptive")$
1504 @c s1 : read_list (file_search ("pidigits.data"))$
1505 @c median_deviation (s1);
1506 @c s2 : read_matrix (file_search ("wind.data"))$
1507 @c median_deviation (s2);
1508 @c ===end===
1509 @example
1510 (%i1) load ("descriptive")$
1511 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1512 @group
1513 (%i3) median_deviation (s1);
1514                                 5
1515 (%o3)                           -
1516                                 2
1517 @end group
1518 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1519 @group
1520 (%i5) median_deviation (s2);
1521 (%o5) [2.75, 2.7550000000000003, 3.08, 4.315, 3.3099999999999996]
1522 @end group
1523 @end example
1524
1525 See also function @mrefdot{median}
1526
1527 @opencatbox{Categories:}
1528 @category{Package descriptive}
1529 @closecatbox
1530 @end deffn
1531
1532
1533
1534 @anchor{harmonic_mean}
1535 @deffn {Function} harmonic_mean @
1536 @fname{harmonic_mean} (@var{list}) @
1537 @fname{harmonic_mean} (@var{matrix})
1538
1539 The harmonic mean, defined as
1540 @ifnottex
1541 @example
1542 @group
1543                   n
1544                --------
1545                 n
1546                ====
1547                \     1
1548                 >    --
1549                /     x
1550                ====   i
1551                i = 1
1552 @end group
1553 @end example
1554 @end ifnottex
1555 @tex
1556 $${{n}\over{\sum_{i=1}^{n}{{{1}\over{x_{i}}}}}}$$
1557 @end tex
1558
1559 Example:
1560
1561 @c ===beg===
1562 @c load ("descriptive")$
1563 @c y : [5, 7, 2, 5, 9, 5, 6, 4, 9, 2, 4, 2, 5]$
1564 @c harmonic_mean (y), numer;
1565 @c s2 : read_matrix (file_search ("wind.data"))$
1566 @c harmonic_mean (s2);
1567 @c ===end===
1568 @example
1569 (%i1) load ("descriptive")$
1570 (%i2) y : [5, 7, 2, 5, 9, 5, 6, 4, 9, 2, 4, 2, 5]$
1571 @group
1572 (%i3) harmonic_mean (y), numer;
1573 (%o3)                  3.9018580276322052
1574 @end group
1575 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1576 @group
1577 (%i5) harmonic_mean (s2);
1578 (%o5) [6.948015590052786, 7.391967752360356, 9.055658197151745,
1579                            13.441990281936924, 13.01439145898509]
1580 @end group
1581 @end example
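
As a hedged check of the formula, the harmonic mean of a small made-up sample can be computed directly from the sum of reciprocals:

@example
load ("descriptive")$
y : [1, 2, 4]$                       /* illustrative data */
harmonic_mean (y);                   /* 3/(1 + 1/2 + 1/4) = 12/7 */
length (y) / lsum (1/yi, yi, y);     /* 12/7 */
@end example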
1582
1583 See also functions @mref{mean} and @mrefdot{geometric_mean}
1584
1585 @opencatbox{Categories:}
1586 @category{Package descriptive}
1587 @closecatbox
1588
1589 @end deffn
1590
1591
1592
1593 @anchor{geometric_mean}
1594 @deffn {Function} geometric_mean @
1595 @fname{geometric_mean} (@var{list}) @
1596 @fname{geometric_mean} (@var{matrix})
1597
1598 The geometric mean, defined as
1599 @ifnottex
1600 @example
1601 @group
1602                  /  n      \ 1/n
1603                  | /===\   |
1604                  |  ! !    |
1605                  |  ! !  x |
1606                  |  ! !   i|
1607                  | i = 1   |
1608                  \         /
1609 @end group
1610 @end example
1611 @end ifnottex
1612 @tex
1613 $$\left(\prod_{i=1}^{n}{x_{i}}\right)^{{{1}\over{n}}}$$
1614 @end tex
1615
1616 Example:
1617
1618 @c ===beg===
1619 @c load ("descriptive")$
1620 @c y : [5, 7, 2, 5, 9, 5, 6, 4, 9, 2, 4, 2, 5]$
1621 @c geometric_mean (y), numer;
1622 @c s2 : read_matrix (file_search ("wind.data"))$
1623 @c geometric_mean (s2);
1624 @c ===end===
1625 @example
1626 (%i1) load ("descriptive")$
1627 (%i2) y : [5, 7, 2, 5, 9, 5, 6, 4, 9, 2, 4, 2, 5]$
1628 @group
1629 (%i3) geometric_mean (y), numer;
1630 (%o3)                   4.454845412337012
1631 @end group
1632 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1633 @group
1634 (%i5) geometric_mean (s2);
1635 (%o5) [8.82476274347979, 9.22652604739361, 10.044267571488904,
1636                            14.612741263490207, 13.96184163444275]
1637 @end group
1638 @end example
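
For positive data the harmonic, geometric and arithmetic means are ordered from smallest to largest. The sketch below illustrates this on a made-up sample; results are converted to floats so that the comparison does not depend on how the root is displayed.

@example
load ("descriptive")$
y : [1, 2, 4]$                       /* illustrative data */
float (geometric_mean (y));          /* cube root of 1*2*4 = 8, that is 2 */
is (float (harmonic_mean (y)) <= float (geometric_mean (y)));   /* 12/7 <= 2: true */
is (float (geometric_mean (y)) <= float (mean (y)));            /* 2 <= 7/3: true */
@end example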
1639
1640 See also functions @mref{mean} and @mrefdot{harmonic_mean}
1641
1642 @opencatbox{Categories:}
1643 @category{Package descriptive}
1644 @closecatbox
1645 @end deffn
1646
1647
1648
1649 @anchor{kurtosis}
1650 @deffn {Function} kurtosis @
1651 @fname{kurtosis} (@var{list}) @
1652 @fname{kurtosis} (@var{matrix})
1653
1654 The kurtosis coefficient, defined as
1655 @ifnottex
1656 @example
1657 @group
1658                     n
1659                   ====
1660             1     \          _ 4
1661            ----    >    (x - x)  - 3
1662               4   /       i
1663            n s    ====
1664                   i = 1
1665 @end group
1666 @end example
1667 @end ifnottex
1668 @tex
1669 $${{1\over{n s^4}}{\sum_{i=1}^{n}{(x_{i}-\bar{x})^4}}-3}$$
1670 @end tex
1671
1672 Example:
1673
1674 @c ===beg===
1675 @c load ("descriptive")$
1676 @c s1 : read_list (file_search ("pidigits.data"))$
1677 @c kurtosis (s1), numer;
1678 @c s2 : read_matrix (file_search ("wind.data"))$
1679 @c kurtosis (s2);
1680 @c ===end===
1681 @example
1682 (%i1) load ("descriptive")$
1683 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1684 @group
1685 (%i3) kurtosis (s1), numer;
1686 (%o3)                  - 1.273247946514421
1687 @end group
1688 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1689 @group
1690 (%i5) kurtosis (s2);
1691 (%o5) [- 0.2715445622195385, 0.119998784429451,
1692 - 0.42752334904828615, - 0.6405361979019522,
1693 - 0.4952382132352935]
1694 @end group
1695 @end example
1696
1697 See also functions @mrefcomma{mean} @mref{var} and @mrefdot{skewness}
1698
1699 @opencatbox{Categories:}
1700 @category{Package descriptive}
1701 @closecatbox
1702 @end deffn
1703
1704
1705
1706 @anchor{skewness}
1707 @deffn {Function} skewness @
1708 @fname{skewness} (@var{list}) @
1709 @fname{skewness} (@var{matrix})
1710
1711 The skewness coefficient, defined as
1712 @ifnottex
1713 @example
1714 @group
1715                     n
1716                   ====
1717             1     \          _ 3
1718            ----    >    (x - x)
1719               3   /       i
1720            n s    ====
1721                   i = 1
1722 @end group
1723 @end example
1724 @end ifnottex
1725 @tex
1726 $${{1\over{n s^3}}{\sum_{i=1}^{n}{(x_{i}-\bar{x})^3}}}$$
1727 @end tex
1728
1729 Example:
1730
1731 @c ===beg===
1732 @c load ("descriptive")$
1733 @c s1 : read_list (file_search ("pidigits.data"))$
1734 @c skewness (s1), numer;
1735 @c s2 : read_matrix (file_search ("wind.data"))$
1736 @c skewness (s2);
1737 @c ===end===
1738 @example
1739 (%i1) load ("descriptive")$
1740 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1741 @group
1742 (%i3) skewness (s1), numer;
1743 (%o3)                 0.009196180476450424
1744 @end group
1745 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1746 @group
1747 (%i5) skewness (s2);
1748 (%o5) [0.1580509020000978, 0.2926379232061854,
1749    0.09242174416107717, 0.20599843481486865, 0.21425202488908313]
1750 @end group
1751 @end example
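
The values printed above for @code{wind.data} agree with reading @math{s} as @code{std}, the standard deviation with denominator @math{n}; the sketch below relies on that reading as an assumption. It uses two made-up samples: a symmetric one, whose skewness is zero, and one with a long right tail, whose skewness is positive.

@example
load ("descriptive")$
x : [1, 2, 3, 4, 5]$          /* symmetric about its mean */
skewness (x);                 /* 0, the third central moment vanishes */
y : [1, 1, 1, 10]$            /* long right tail */
is (float (skewness (y)) > 0);                                  /* true */
/* expected true if s is std, which is assumed here */
is (equal (skewness (y), central_moment (y, 3) / std (y)^3));
@end example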
1752
1753 See also functions @mrefcomma{mean} @mref{var} and @mrefdot{kurtosis}
1754
1755 @opencatbox{Categories:}
1756 @category{Package descriptive}
1757 @closecatbox
1758 @end deffn
1759
1760
1761
1762 @anchor{pearson_skewness}
1763 @deffn {Function} pearson_skewness @
1764 @fname{pearson_skewness} (@var{list}) @
1765 @fname{pearson_skewness} (@var{matrix})
1766
1767 Pearson's skewness coefficient, defined as
1768 @ifnottex
1769 @example
1770 @group
1771                 _
1772              3 (x - med)
1773              -----------
1774                   s
1775 @end group
1776 @end example
1777 @end ifnottex
1778 @tex
1779 $${{3\,\left(\bar{x}-med\right)}\over{s}}$$
1780 @end tex
1781 where @var{med} is the median of @var{list} and @math{s} is its standard deviation with denominator @math{n-1} (see @mref{std1}).
1782
1783 Example:
1784
1785 @c ===beg===
1786 @c load ("descriptive")$
1787 @c s1 : read_list (file_search ("pidigits.data"))$
1788 @c pearson_skewness (s1), numer;
1789 @c s2 : read_matrix (file_search ("wind.data"))$
1790 @c pearson_skewness (s2);
1791 @c ===end===
1792 @example
1793 (%i1) load ("descriptive")$
1794 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1795 @group
1796 (%i3) pearson_skewness (s1), numer;
1797 (%o3)                  0.21594840290938955
1798 @end group
1799 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1800 @group
1801 (%i5) pearson_skewness (s2);
1802 (%o5) [- 0.08019976629211892, 0.2357036272952649,
1803    0.10509040624912039, 0.12450423405923679, 0.44641817958045193]
1804 @end group
1805 @end example
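
The value printed above for @code{pidigits.data} is reproduced when @math{s} is taken to be @code{std1}: 3 (4.71 - 4.5) / 2.917363... is approximately 0.215948. Under that assumption, the formula can be checked on a small made-up sample:

@example
load ("descriptive")$
x : [1, 2, 3, 4, 10]$         /* illustrative data, mean 4, median 3 */
/* expected true if s is std1, which is assumed here */
is (equal (pearson_skewness (x),
           3 * (mean (x) - median (x)) / std1 (x)));
@end example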
1806
1807 See also functions @mrefcomma{mean} @mref{var} and @mrefdot{median}
1808
1809 @opencatbox{Categories:}
1810 @category{Package descriptive}
1811 @closecatbox
1812 @end deffn
1813
1814
1815
1816 @anchor{quartile_skewness}
1817 @deffn {Function} quartile_skewness @
1818 @fname{quartile_skewness} (@var{list}) @
1819 @fname{quartile_skewness} (@var{matrix})
1820
1821 The quartile skewness coefficient, defined as
1822 @ifnottex
1823 @example
1824 @group
1825                c    - 2 c    + c
1826                 3/4      1/2    1/4
1827                --------------------
1828                    c    - c
1829                     3/4    1/4
1830 @end group
1831 @end example
1832 @end ifnottex
1833 @tex
1834 $${{c_{{{3}\over{4}}}-2\,c_{{{1}\over{2}}}+c_{{{1}\over{4}}}}\over{c
1835  _{{{3}\over{4}}}-c_{{{1}\over{4}}}}}$$
1836 @end tex
1837 where @math{c_p} is the @var{p}-quantile of sample @var{list}.
1838
1839 Example:
1840
1841 @c ===beg===
1842 @c load ("descriptive")$
1843 @c s1 : read_list (file_search ("pidigits.data"))$
1844 @c quartile_skewness (s1), numer;
1845 @c s2 : read_matrix (file_search ("wind.data"))$
1846 @c quartile_skewness (s2);
1847 @c ===end===
1848 @example
1849 (%i1) load ("descriptive")$
1850 (%i2) s1 : read_list (file_search ("pidigits.data"))$
1851 @group
1852 (%i3) quartile_skewness (s1), numer;
1853 (%o3)                 0.047619047619047616
1854 @end group
1855 (%i4) s2 : read_matrix (file_search ("wind.data"))$
1856 @group
1857 (%i5) quartile_skewness (s2);
1858 (%o5) [- 0.040854224698235304, 0.14670255720053824,
1859    0.033623910336239196, 0.03780068728522298, 0.2105263157894735]
1860 @end group
1861 @end example
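
A short sketch that rebuilds the coefficient from the three quartiles returned by @code{quantile}. The sample and the names @code{q1}, @code{q2}, @code{q3} are made up for illustration.

@example
load ("descriptive")$
x : [1, 2, 3, 4, 5, 7, 10, 15, 20]$     /* illustrative data */
[q1, q2, q3] : [quantile (x, 1/4), quantile (x, 1/2), quantile (x, 3/4)]$
is (equal (quartile_skewness (x), (q3 - 2*q2 + q1) / (q3 - q1)));   /* true */
@end example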
1862
1863 See also function @mrefdot{quantile}
1864
1865 @opencatbox{Categories:}
1866 @category{Package descriptive}
1867 @closecatbox
1868 @end deffn
1869
1870
1871
1872 @anchor{km}
1873 @deffn {Function} km @
1874 @fname{km} (@var{list}, @var{option} ...) @
1875 @fname{km} (@var{matrix}, @var{option} ...)
1876
1877 Kaplan-Meier estimator of the survival, or reliability, function @math{S(x)=1-F(x)}.
1878
1879 Data can be given as a list of pairs or as a two-column matrix. The first
1880 component is the observed time, and the second component is a censoring index
1881 (1 = not censored, 0 = right censored).
1882
1883 The optional argument is the name of the variable in the returned expression,
1884 which is @var{x} by default.
1885
1886 Examples:
1887
1888 Sample as a list of pairs.
1889
1890 @c ===beg===
1891 @c load ("descriptive")$
1892 @c S: km([[2,1], [3,1], [5,0], [8,1]]);
1893 @c load ("draw")$
1894 @c draw2d(
1895 @c   line_width = 3, grid = true,
1896 @c   explicit(S, x, -0.1, 10))$
1897 @c ===end===
1898 @example
1899 (%i1) load ("descriptive")$
1900 @group
1901 (%i2) S: km([[2,1], [3,1], [5,0], [8,1]]);
1902                        charfun((3 <= x) and (x < 8))
1903 (%o2) charfun(x < 0) + -----------------------------
1904                                      2
1905    3 charfun((2 <= x) and (x < 3))
1906  + -------------------------------
1907                   4
1908  + charfun((0 <= x) and (x < 2))
1909 @end group
1910 (%i3) load ("draw")$
1911 @group
1912 (%i4) draw2d(
1913   line_width = 3, grid = true,
1914   explicit(S, x, -0.1, 10))$
1915 @end group
1916 @end example
1917
1918 Estimate survival probabilities.
1919
1920 @c ===beg===
1921 @c load ("descriptive")$
1922 @c S(t):= ''(km([[2,1], [3,1], [5,0], [8,1]], t)) $
1923 @c S(6);
1924 @c ===end===
1925 @example
1926 (%i1) load ("descriptive")$
1927 (%i2) S(t):= ''(km([[2,1], [3,1], [5,0], [8,1]], t)) $
1928 @group
1929 (%i3) S(6);
1930                                 1
1931 (%o3)                           -
1932                                 2
1933 @end group
1934 @end example
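
The step heights 3/4 and 1/2 in the first example can be recovered by hand with the usual product-limit rule: at each uncensored time the running estimate is multiplied by one minus the number of events divided by the number of observations still at risk. The sketch below is plain arithmetic on the sample @code{[[2,1], [3,1], [5,0], [8,1]]}; the names @code{S2}, @code{S3} and @code{S8} are chosen for illustration.

@example
S2 : 1 - 1/4;            /* time 2: 4 at risk, 1 event, estimate 3/4 */
S3 : S2 * (1 - 1/3);     /* time 3: 3 at risk, 1 event, estimate 1/2 */
                         /* time 5 is censored, estimate unchanged */
S8 : S3 * (1 - 1/1);     /* time 8: 1 at risk, 1 event, estimate 0 */
@end example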
1935
1936 @opencatbox{Categories:}
1937 @category{Package descriptive}
1938 @closecatbox
1939 @end deffn
1940
1941
1942
1943 @anchor{cdf_empirical}
1944 @deffn {Function} cdf_empirical @
1945 @fname{cdf_empirical} (@var{list}, @var{option} ...) @
1946 @fname{cdf_empirical} (@var{matrix}, @var{option} ...)
1947
1948 Empirical distribution function @math{F(x)}.
1949
1950 Data can be given as a list of numbers or as a one-column matrix.
1951
1952 The optional argument is the name of the variable in the returned expression,
1953 which is @var{x} by default.
1954
1955 Example:
1956
1957 Empirical distribution function.
1958
1959 @c ===beg===
1960 @c load ("descriptive")$
1961 @c F(x):= ''(cdf_empirical([1,3,3,5,7,7,7,8,9]));
1962 @c F(6);
1963 @c load("draw")$
1964 @c draw2d(
1965 @c    line_width = 3,
1966 @c    grid       = true,
1967 @c    explicit(F(z), z, -2, 12)) $
1968 @c ===end===
1969 @example
1970 (%i1) load ("descriptive")$
1971 @group
1972 (%i2) F(x):= ''(cdf_empirical([1,3,3,5,7,7,7,8,9]));
1973 (%o2) F(x) := (charfun(x >= 9) + charfun(x >= 8)
1974  + 3 charfun(x >= 7) + charfun(x >= 5) + 2 charfun(x >= 3)
1975  + charfun(x >= 1))/9
1976 @end group
1977 @group
1978 (%i3) F(6);
1979                                 4
1980 (%o3)                           -
1981                                 9
1982 @end group
1983 (%i4) load("draw")$
1984 @group
1985 (%i5) draw2d(
1986    line_width = 3,
1987    grid       = true,
1988    explicit(F(z), z, -2, 12)) $
1989 @end group
1990 @end example
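
The value @code{F(6)} above is the proportion of observations not exceeding 6. As a hedged cross-check, the same fraction can be counted directly with @code{sublist}; the name @code{data} is chosen for illustration.

@example
data : [1, 3, 3, 5, 7, 7, 7, 8, 9]$
/* four of the nine values (1, 3, 3, 5) are <= 6 */
length (sublist (data, lambda ([v], v <= 6))) / length (data);   /* 4/9 */
@end example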
1991
1992 @opencatbox{Categories:}
1993 @category{Package descriptive}
1994 @closecatbox
1995 @end deffn
1996
1997
1998
1999 @anchor{cov}
2000 @deffn {Function} cov (@var{matrix})
2001 The covariance matrix of the multivariate sample, defined as
2002 @ifnottex
2003 @example
2004 @group
2005               n
2006              ====
2007           1  \           _        _
2008       S = -   >    (X  - X) (X  - X)'
2009           n  /       j        j
2010              ====
2011              j = 1
2012 @end group
2013 @end example
2014 @end ifnottex
2015 @tex
2016 $${S={1\over{n}}{\sum_{j=1}^{n}{\left(X_{j}-\bar{X}\right)\,\left(X_{j}-\bar{X}\right)'}}}$$
2017 @end tex
2018 where @math{X_j} is the @math{j}-th row of the sample matrix.
2019
2020 Example:
2021
2022 @c ===beg===
2023 @c load ("descriptive")$
2024 @c s2 : read_matrix (file_search ("wind.data"))$
2025 @c fpprintprec : 7$
2026 @c cov (s2);
2027 @c ===end===
2028 @example
2029 (%i1) load ("descriptive")$
2030 (%i2) s2 : read_matrix (file_search ("wind.data"))$
2031 (%i3) fpprintprec : 7$
2032 @group
2033 (%i4) cov (s2);
2034       [ 17.22191  13.61811  14.37217  19.39624  15.42162 ]
2035       [                                                  ]
2036       [ 13.61811  14.98774  13.30448  15.15834  14.9711  ]
2037       [                                                  ]
2038 (%o4) [ 14.37217  13.30448  15.47573  17.32544  16.18171 ]
2039       [                                                  ]
2040       [ 19.39624  15.15834  17.32544  32.17651  20.44685 ]
2041       [                                                  ]
2042       [ 15.42162  14.9711   16.18171  20.44685  24.42308 ]
2043 @end group
2044 @end example
2045
2046 See also function @mrefdot{cov1}
2047
2048 @opencatbox{Categories:}
2049 @category{Package descriptive}
2050 @closecatbox
2051 @end deffn
2052
2053
2054
2055 @anchor{cov1}
2056 @deffn {Function} cov1 (@var{matrix})
2057 The covariance matrix of the multivariate sample, defined as
2058 @ifnottex
2059 @example
2060 @group
2061               n
2062              ====
2063          1   \           _        _
2064    S  = ---   >    (X  - X) (X  - X)'
2065     1   n-1  /       j        j
2066              ====
2067              j = 1
2068 @end group
2069 @end example
2070 @end ifnottex
2071 @tex
2072 $${S_{1}={1\over{n-1}}{\sum_{j=1}^{n}{\left(X_{j}-\bar{X}\right)\,\left(X_{j}-\bar{X}\right)'}}}$$
2073 @end tex
2074 where @math{X_j} is the @math{j}-th row of the sample matrix.
2075
2076 Example:
2077
2078 @c ===beg===
2079 @c load ("descriptive")$
2080 @c s2 : read_matrix (file_search ("wind.data"))$
2081 @c fpprintprec : 7$
2082 @c cov1 (s2);
2083 @c ===end===
2084 @example
2085 (%i1) load ("descriptive")$
2086 (%i2) s2 : read_matrix (file_search ("wind.data"))$
2087 (%i3) fpprintprec : 7$
2088 @group
2089 (%i4) cov1 (s2);
2090       [ 17.39587  13.75567  14.51734  19.59216  15.5774  ]
2091       [                                                  ]
2092       [ 13.75567  15.13913  13.43887  15.31145  15.12232 ]
2093       [                                                  ]
2094 (%o4) [ 14.51734  13.43887  15.63205  17.50044  16.34516 ]
2095       [                                                  ]
2096       [ 19.59216  15.31145  17.50044  32.50153  20.65338 ]
2097       [                                                  ]
2098       [ 15.5774   15.12232  16.34516  20.65338  24.66977 ]
2099 @end group
2100 @end example
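
The definitions of @code{cov} and @code{cov1} differ only in the divisor,
so @code{cov1} should equal @code{cov} multiplied by @math{n/(n-1)}.
The following is a minimal sketch that checks this on a small exact sample.
The matrix @code{m} and the names @code{n}, @code{xbar} and @code{S} are
made up for illustration, and the rows of @code{m} are handled as
@math{1 x p} matrices, which is why the transpose comes first in the product.

@example
load ("descriptive")$
/* hypothetical sample: 3 records, 2 variables */
m : matrix ([1, 2], [3, 4], [5, 9])$
n : length (m)$
/* mean vector as a 1 x p matrix */
xbar : matrix (mean (m))$
/* covariance computed from the definition, divisor n */
S : (1/n) * apply ("+",
      makelist (block ([d : row (m, j) - xbar], transpose (d) . d),
                j, 1, n))$
/* compare with the package functions */
is (S = cov (m));
is (n/(n - 1) * S = cov1 (m));
@end example

Both comparisons are expected to evaluate to @code{true} if the functions
implement the formulas given above.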
2101
2102 See also function @mrefdot{cov}
2103
2104 @opencatbox{Categories:}
2105 @category{Package descriptive}
2106 @closecatbox
2107 @end deffn
2108
2109
2110
2111 @anchor{global_variances}
2112 @deffn {Function} global_variances @
2113 @fname{global_variances} (@var{matrix}) @
2114 @fname{global_variances} (@var{matrix}, @var{options} ...)
2115
2116 Function @code{global_variances} returns a list of global variance measures:
2117
2118 @itemize @bullet
2119 @item
2120 @var{total variance}: @code{trace(S_1)},
2121 @item
2122 @var{mean variance}: @code{trace(S_1)/p},
2123 @item
2124 @var{generalized variance}: @code{determinant(S_1)},
2125 @item
2126 @var{generalized standard deviation}: @code{sqrt(determinant(S_1))},
2127 @item
2128 @var{effective variance}: @code{determinant(S_1)^(1/p)} (defined in: Pe@~na, D. (2002) @var{An@'alisis de datos multivariantes}; McGraw-Hill, Madrid),
2129 @item
2130 @var{effective standard deviation}: @code{determinant(S_1)^(1/(2*p))}.
2131 @end itemize
2132 where @var{p} is the dimension of the multivariate random variable and @math{S_1} the covariance matrix returned by @code{cov1}.
2133
2134 Option:
2135
2136 @itemize @bullet
2137 @item
2138 @code{'data}, default @code{'true}: when @code{true}, the input matrix contains the sample data
2139 and the covariance matrix @code{cov1} is computed from it; when @code{false}, the input matrix
2140 must be the (symmetric) covariance matrix itself.
2141 @end itemize
2142
2143 Example:
2144
2145 @c ===beg===
2146 @c load ("descriptive")$
2147 @c s2 : read_matrix (file_search ("wind.data"))$
2148 @c global_variances (s2);
2149 @c ===end===
2150 @example
2151 (%i1) load ("descriptive")$
2152 (%i2) s2 : read_matrix (file_search ("wind.data"))$
2153 @group
2154 (%i3) global_variances (s2);
2155 (%o3) [105.33834206060595, 21.06766841212119, 12874.34690469686,
2156        113.46517926085015, 6.636590811800794, 2.5761581496097623]
2157 @end group
2158 @end example
2159
2160 Calculate the @code{global_variances} from the covariance matrix.
2161
2162 @c ===beg===
2163 @c load ("descriptive")$
2164 @c s2 : read_matrix (file_search ("wind.data"))$
2165 @c s : cov1 (s2)$
2166 @c global_variances (s, data=false);
2167 @c ===end===
2168 @example
2169 (%i1) load ("descriptive")$
2170 (%i2) s2 : read_matrix (file_search ("wind.data"))$
2171 (%i3) s : cov1 (s2)$
2172 @group
2173 (%i4) global_variances (s, data=false);
2174 (%o4) [105.33834206060595, 21.06766841212119, 12874.34690469686,
2175        113.46517926085015, 6.636590811800794, 2.5761581496097623]
2176 @end group
2177 @end example
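
The six measures listed above can also be recomputed by hand from
@code{cov1}, which may help when checking a result. This is only a sketch
under the definitions given in this entry; the names @code{S1}, @code{p},
@code{tr}, @code{d} and @code{gv} are illustrative.

@example
load ("descriptive")$
s2 : read_matrix (file_search ("wind.data"))$
S1 : cov1 (s2)$
/* dimension of the multivariate variable */
p : length (S1)$
/* trace and determinant of the covariance matrix */
tr : sum (S1[i, i], i, 1, p)$
d : determinant (S1)$
/* same order as the list returned by global_variances */
gv : [tr, tr/p, d, sqrt (d), d^(1/p), d^(1/(2*p))];
@end example

Up to floating-point rounding, @code{gv} should agree with
@code{global_variances (s2)}.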
2178
2179 See also @mref{cov} and @mrefdot{cov1}
2180
2181 @opencatbox{Categories:}
2182 @category{Package descriptive}
2183 @closecatbox
2184 @end deffn
2185
2186
2187
2188 @anchor{cor}
2189 @deffn {Function} cor @
2190 @fname{cor} (@var{matrix}) @
2191 @fname{cor} (@var{matrix}, @var{logical_value})
2192
2193 The correlation matrix of the multivariate sample.
2194
2195 Option:
2196
2197 @itemize @bullet
2198 @item
2199 @code{'data}, default @code{'true}: when @code{true}, the input matrix contains the sample data
2200 and the covariance matrix @code{cov1} is computed from it; when @code{false}, the input matrix
2201 must be the (symmetric) covariance matrix itself.
2202 @end itemize
2203
2204 Example:
2205
2206 @c ===beg===
2207 @c load ("descriptive")$
2208 @c fpprintprec : 7 $
2209 @c s2 : read_matrix (file_search ("wind.data"))$
2210 @c cor (s2);
2211 @c ===end===
2212 @example
2213 (%i1) load ("descriptive")$
2214 (%i2) fpprintprec : 7 $
2215 (%i3) s2 : read_matrix (file_search ("wind.data"))$
2216 @group
2217 (%i4) cor (s2);
2218       [    1.0     0.8476339  0.8803515  0.8239624  0.7519506 ]
2219       [                                                       ]
2220       [ 0.8476339     1.0     0.8735834  0.6902622  0.782502  ]
2221       [                                                       ]
2222 (%o4) [ 0.8803515  0.8735834     1.0     0.7764065  0.8323358 ]
2223       [                                                       ]
2224       [ 0.8239624  0.6902622  0.7764065     1.0     0.7293848 ]
2225       [                                                       ]
2226       [ 0.7519506  0.782502   0.8323358  0.7293848     1.0    ]
2227 @end group
2228 @end example
2229
2230 Calculate the correlation matrix from the covariance matrix.
2231
2232 @c ===beg===
2233 @c load ("descriptive")$
2234 @c fpprintprec : 7 $
2235 @c s2 : read_matrix (file_search ("wind.data"))$
2236 @c s : cov1 (s2)$
2237 @c cor (s, data=false); /* this is faster */
2238 @c ===end===
2239 @example
2240 (%i1) load ("descriptive")$
2241 (%i2) fpprintprec : 7 $
2242 (%i3) s2 : read_matrix (file_search ("wind.data"))$
2243 (%i4) s : cov1 (s2)$
2244 @group
2245 (%i5) cor (s, data=false); /* this is faster */
2246       [    1.0     0.8476339  0.8803515  0.8239624  0.7519506 ]
2247       [                                                       ]
2248       [ 0.8476339     1.0     0.8735834  0.6902622  0.782502  ]
2249       [                                                       ]
2250 (%o5) [ 0.8803515  0.8735834     1.0     0.7764065  0.8323358 ]
2251       [                                                       ]
2252       [ 0.8239624  0.6902622  0.7764065     1.0     0.7293848 ]
2253       [                                                       ]
2254       [ 0.7519506  0.782502   0.8323358  0.7293848     1.0    ]
2255 @end group
2256 @end example
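
As a rough illustration of what the correlation matrix contains, each
entry can be obtained by dividing a covariance by the product of the two
corresponding standard deviations. The sketch below assumes this usual
definition; the names @code{S1}, @code{p} and @code{R} are illustrative.

@example
load ("descriptive")$
s2 : read_matrix (file_search ("wind.data"))$
S1 : cov1 (s2)$
p : length (S1)$
/* r[i,j] = s[i,j] / sqrt (s[i,i] * s[j,j]) */
R : genmatrix (
      lambda ([i, j], S1[i, j] / sqrt (S1[i, i] * S1[j, j])),
      p, p)$
fpprintprec : 7$
R;
@end example

The result should agree with @code{cor (s2)} up to rounding.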
2257
2258 See also @mref{cov} and @mrefdot{cov1}
2259
2260 @opencatbox{Categories:}
2261 @category{Package descriptive}
2262 @closecatbox
2263 @end deffn
2264
2265
2266
2267 @anchor{list_correlations}
2268 @deffn {Function} list_correlations @
2269 @fname{list_correlations} (@var{matrix}) @
2270 @fname{list_correlations} (@var{matrix}, @var{options} ...)
2271
2272 Function @code{list_correlations} returns a list of correlation measures:
2273
2274 @itemize @bullet
2275
2276 @item
2277 @var{precision matrix}: the inverse of the covariance matrix @math{S_1},
2278 @ifnottex
2279 @example
2280 @group
2281        -1     ij
2282       S   = (s  )
2283        1         i,j = 1,2,...,p
2284 @end group
2285 @end example
2286 @end ifnottex
2287 @tex
2288 $${S_{1}^{-1}}={\left(s^{ij}\right)_{i,j=1,2,\ldots, p}}$$
2289 @end tex
2290
2291 @item
2292 @var{multiple correlation vector}:  @math{(R_1^2, R_2^2, ..., R_p^2)}, with
2293 @ifnottex
2294 @example
2295 @group
2296        2          1
2297       R  = 1 - -------
2298        i        ii
2299                s   s
2300                     ii
2301 @end group
2302 @end example
2303 @end ifnottex
2304 @tex
2305 $${R_{i}^{2}}={1-{{1}\over{s^{ii}s_{ii}}}}$$
2306 @end tex
2307 which indicates the goodness of fit of the linear multivariate regression model on @math{X_i} when the remaining variables are used as regressors.
2308
2309 @item
2310 @var{partial correlation matrix}: with element @math{(i, j)} being
2311 @ifnottex
2312 @example
2313 @group
2314                          ij
2315                         s
2316       r        = - ------------
2317        ij.rest     / ii  jj\ 1/2
2318                    |s   s  |
2319                    \       /
2320 @end group
2321 @end example
2322 @end ifnottex
2323 @tex
2324 $${r_{ij.rest}}={-{{s^{ij}}\over \sqrt{s^{ii}s^{jj}}}}$$
2325 @end tex
2326
2327 @end itemize
2328
2329 Option:
2330
2331 @itemize @bullet
2332 @item
2333 @code{'data}, default @code{'true}: when @code{true}, the input matrix contains the sample data
2334 and the covariance matrix @code{cov1} is computed from it; when @code{false}, the input matrix
2335 must be the (symmetric) covariance matrix itself.
2336 @end itemize
2337
2338 Example:
2339
2340 @c ===beg===
2341 @c load ("descriptive")$
2342 @c s2 : read_matrix (file_search ("wind.data"))$
2343 @c z : list_correlations (s2)$
2344 @c fpprintprec : 5$
2345 @c precision_matrix: z[1];
2346 @c multiple_correlation_vector: z[2];
2347 @c partial_correlation_matrix: z[3];
2348 @c ===end===
2349 @example
2350 (%i1) load ("descriptive")$
2351 (%i2) s2 : read_matrix (file_search ("wind.data"))$
2352 (%i3) z : list_correlations (s2)$
2353 (%i4) fpprintprec : 5$
2354 @group
2355 (%i5) precision_matrix: z[1];
2356 (%o5)
2357     [  0.38486   - 0.13856   - 0.15626   - 0.10239    0.031179  ]
2358     [                                                           ]
2359     [ - 0.13856   0.34107    - 0.15233    0.038447   - 0.052842 ]
2360     [                                                           ]
2361     [ - 0.15626  - 0.15233    0.47296    - 0.024816  - 0.10054  ]
2362     [                                                           ]
2363     [ - 0.10239   0.038447   - 0.024816   0.10937    - 0.034033 ]
2364     [                                                           ]
2365     [ 0.031179   - 0.052842  - 0.10054   - 0.034033   0.14834   ]
2366 @end group
2367 @group
2368 (%i6) multiple_correlation_vector: z[2];
2369 (%o6)     [0.85063, 0.80634, 0.86474, 0.71867, 0.72675]
2370 @end group
2371 @group
2372 (%i7) partial_correlation_matrix: z[3];
2373       [   - 1.0     0.38244   0.36627   0.49908   - 0.13049 ]
2374       [                                                     ]
2375       [  0.38244     - 1.0    0.37927  - 0.19907   0.23492  ]
2376       [                                                     ]
2377 (%o7) [  0.36627    0.37927    - 1.0    0.10911    0.37956  ]
2378       [                                                     ]
2379       [  0.49908   - 0.19907  0.10911    - 1.0     0.26719  ]
2380       [                                                     ]
2381       [ - 0.13049   0.23492   0.37956   0.26719     - 1.0   ]
2382 @end group
2383 @end example
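
The three formulas above can be applied directly to the matrix returned
by @code{cov1}. The following is a sketch under those formulas, not the
package's own code; the names @code{S1}, @code{P}, @code{R2} and @code{PC}
are illustrative.

@example
load ("descriptive")$
s2 : read_matrix (file_search ("wind.data"))$
S1 : cov1 (s2)$
p : length (S1)$
/* precision matrix: inverse of the covariance matrix */
P : invert (S1)$
/* multiple correlation vector */
R2 : makelist (1 - 1/(P[i, i] * S1[i, i]), i, 1, p)$
/* partial correlation matrix; its diagonal is -1 by construction */
PC : genmatrix (
       lambda ([i, j], -P[i, j] / sqrt (P[i, i] * P[j, j])),
       p, p)$
fpprintprec : 5$
[P, R2, PC];
@end example

The three results should agree, up to rounding, with the list returned
by @code{list_correlations (s2)}.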
2384
2385 See also @mref{cov} and @mrefdot{cov1}
2386
2387 @opencatbox{Categories:}
2388 @category{Package descriptive}
2389 @closecatbox
2390 @end deffn
2391
2392
2393
2394
2395 @anchor{principal_components}
2396 @deffn {Function} principal_components @
2397 @fname{principal_components} (@var{matrix}) @
2398 @fname{principal_components} (@var{matrix}, @var{options} ...)
2399
2400 Calculates the principal components of a multivariate sample. Principal components are
2401 used in multivariate statistical analysis to reduce the dimensionality of the sample.
2402
2403 Option:
2404
2405 @itemize @bullet
2406 @item
2407 @code{'data}, default @code{'true}: when @code{true}, the input matrix contains the sample data
2408 and the covariance matrix @mref{cov1} is computed from it; when @code{false}, the input matrix
2409 must be the (symmetric) covariance matrix itself.
2410 @end itemize
2411
2412 The output of function @code{principal_components} is a list with the following results:
2413
2414 @itemize @bullet
2415 @item
2416 variances of the principal components,
2417 @item
2418 percentage of total variance explained by each principal component,
2419 @item
2420 rotation matrix.
2421 @end itemize
2422
2423 Examples:
2424
2425 In this sample, the first component explains 83.13 per cent of total
2426 variance.
2427
2428 @example
2429 (%i1) load ("descriptive")$
2430 (%i2) s2 : read_matrix (file_search ("wind.data"))$
2431 (%i3) fpprintprec:4 $
2432 (%i4) res: principal_components(s2);
2433 0 errors, 0 warnings
2434 (%o4) [[87.57, 8.753, 5.515, 1.889, 1.613],
2435 [83.13, 8.31, 5.235, 1.793, 1.531],
2436 @group
2437 [ .4149  .03379   - .4757  - 0.581   - .5126 ]
2438 [                                            ]
2439 [ 0.369  - .3657  - .4298   .7237    - .1469 ]
2440 [                                            ]
2441 [ .3959  - .2178  - .2181  - .2749    .8201  ]]
2442 [                                            ]
2443 [ .5548   .7744    .1857    .2319    .06498  ]
2444 [                                            ]
2445 [ .4765  - .4669   0.712   - .09605  - .1969 ]
2446 @end group
2447 (%i5) /* accumulated percentages  */
2448     block([ap: copy(res[2])],
2449       for k:2 thru length(ap) do ap[k]: ap[k]+ap[k-1],
2450       ap);
2451 (%o5)                 [83.13, 91.44, 96.68, 98.47, 100.0]
2452 (%i6) /* sample dimension */
2453       p: length(first(res));
2454 (%o6)                                  5
2455 (%i7) /* plot percentages to select number of
2456          principal components for further work */
2457      draw2d(
2458         fill_density = 0.2,
2459         apply(bars, makelist([k, res[2][k], 1/2], k, p)),
2460         points_joined = true,
2461         point_type    = filled_circle,
2462         point_size    = 3,
2463         points(makelist([k, res[2][k]], k, p)),
2464         xlabel = "Variances",
2465         ylabel = "Percentages",
2466         xtics  = setify(makelist([concat("PC",k),k], k, p))) $
2467 @end example
2468
2469 In case the covariance matrix is known, it can be passed to the function,
2470 but option @code{data=false} must be used.
2471
2472 @example
2473 (%i1) load ("descriptive")$
2474 (%i2) S: matrix([1,-2,0],[-2,5,0],[0,0,2]);
2475                                 [  1   - 2  0 ]
2476                                 [             ]
2477 (%o2)                           [ - 2   5   0 ]
2478                                 [             ]
2479                                 [  0    0   2 ]
2480 (%i3) fpprintprec:4 $
2481 (%i4) /* the argument is a covariance matrix */
2482       res: principal_components(S, data=false);
2483 0 errors, 0 warnings
2484                                                   [ - .3827  0.0  .9239 ]
2485                                                   [                     ]
2486 (%o4) [[5.828, 2.0, .1716], [72.86, 25.0, 2.145], [  .9239   0.0  .3827 ]]
2487                                                   [                     ]
2488                                                   [   0.0    1.0   0.0  ]
2489 (%i5) /* transformation to get the principal components
2490          from original records */
2491       matrix([a1,b1,c1],[a2,b2,c2]).last(res);
2492              [ .9239 b1 - .3827 a1  1.0 c1  .3827 b1 + .9239 a1 ]
2493 (%o5)        [                                                  ]
2494              [ .9239 b2 - .3827 a2  1.0 c2  .3827 b2 + .9239 a2 ]
2495 @end example
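
A quick consistency check on the three results: the rotation matrix
should diagonalize the covariance matrix, and the percentages should be
the variances divided by their sum. This is a sketch that reuses the
covariance matrix @code{S} from the example above; the names @code{R},
@code{D} and @code{pct} are illustrative.

@example
load ("descriptive")$
S : matrix ([1, -2, 0], [-2, 5, 0], [0, 0, 2])$
res : principal_components (S, data=false)$
R : last (res)$
/* should be close to a diagonal matrix holding first (res) */
D : transpose (R) . S . R;
/* percentages recomputed from the variances */
pct : 100 * first (res) / apply ("+", first (res));
@end example

Off-diagonal entries of @code{D} are expected to vanish only up to
floating-point error.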
2496
2497 @opencatbox{Categories:}
2498 @category{Package descriptive}
2499 @closecatbox
2500 @end deffn
2501
2502
2503
2504 @node Functions and Variables for statistical graphs,  , Functions and Variables for descriptive statistics, descriptive-pkg
2505 @section Functions and Variables for statistical graphs
2506
2507
2508
2509 @anchor{barsplot}
2510 @deffn {Function} barsplot (@var{data1}, @var{data2}, @dots{}, @var{option_1}, @var{option_2}, @dots{})
2511
2512 Plots bar diagrams for discrete statistical variables,
2513 for one or multiple samples.
2514
2515 @var{data} can be a list of outcomes representing one sample, or a
2516 matrix of @var{m} rows and @var{n} columns, representing @var{n} samples of size
2517 @var{m} each.
2518
2519 Available options are:
2520
2521 @itemize @bullet
2522
2523 @item
2524 @var{box_width} (default, @code{3/4}): relative width of rectangles. This
2525 value must be in the range @code{[0,1]}.
2526
2527 @item
2528 @var{grouping} (default, @code{clustered}): indicates how multiple samples are
2529 shown. Valid values are: @code{clustered} and @code{stacked}.
2530
2531 @item
2532 @var{groups_gap} (default, @code{1}): a positive integer number representing
2533 the gap between two consecutive groups of bars.
2534
2535 @item
2536 @var{bars_colors} (default, @code{[]}): a list of colors for multiple samples.
2537 When there are more samples than specified colors, the extra necessary colors
2538 are chosen at random. See @code{color} to learn more about them.
2539
2540 @item
2541 @var{frequency} (default, @code{absolute}): indicates the scale of the
2542 ordinates. Possible values are:  @code{absolute}, @code{relative},
2543 and @code{percent}.
2544
2545 @item
2546 @var{ordering} (default, @code{orderlessp}): possible values are @code{orderlessp} or @code{ordergreatp},
2547 indicating how statistical outcomes should be ordered on the @var{x}-axis.
2548
2549 @item
2550 @var{sample_keys} (default, @code{[]}): a list with the strings to be used in the legend.
2551 When the list length is other than 0 or the number of samples, an error message is returned.
2552
2553 @item
2554 @var{start_at} (default, @code{0}): indicates where the plot begins on the
2555 @var{x}-axis.
2556
2557 @item
2558 All global @code{draw} options, except @code{xtics}, which is
2559 internally assigned by @code{barsplot}.
2560 If you want to set your own values for this option or want to build
2561 complex scenes, make use of @code{barsplot_description}. See example below.
2562
2563 @item
2564 The following local @ref{draw-pkg} options: @mrefcomma{key} @mrefcomma{color_draw}
2565 @mrefcomma{fill_color} @mref{fill_density} and @mrefdot{line_width}
2566 See also
2567 @code{bars}.
2568
2569 @end itemize
2570
2571 There is also a function @code{wxbarsplot} for creating embedded
2572 bar diagrams in interfaces wxMaxima and iMaxima.
2574
2575 Examples:
2576
2577 Univariate sample in matrix form. Absolute frequencies.
2578
2579 @c ===beg===
2580 @c load ("descriptive")$
2581 @c m : read_matrix (file_search ("biomed.data"))$
2582 @c barsplot(
2583 @c   col(m,2),
2584 @c   title        = "Ages",
2585 @c   xlabel       = "years",
2586 @c   box_width    = 1/2,
2587 @c   fill_density = 3/4)$
2588 @c ===end===
2589 @example
2590 (%i1) load ("descriptive")$
2591 (%i2) m : read_matrix (file_search ("biomed.data"))$
2592 @group
2593 (%i3) barsplot(
2594   col(m,2),
2595   title        = "Ages",
2596   xlabel       = "years",
2597   box_width    = 1/2,
2598   fill_density = 3/4)$
2599 @end group
2600 @end example
2601
2602 Two samples of different sizes, with
2603 relative frequencies and user-declared colors.
2604
2605 @c ===beg===
2606 @c load ("descriptive")$
2607 @c l1:makelist(random(10),k,1,50)$
2608 @c l2:makelist(random(10),k,1,100)$
2609 @c barsplot(
2610 @c    l1,l2,
2611 @c    box_width = 1,
2612 @c    fill_density = 1,
2613 @c    bars_colors = [black, grey],
2614 @c    frequency = relative,
2615 @c    sample_keys = ["A", "B"])$
2616 @c ===end===
2617 @example
2618 (%i1) load ("descriptive")$
2619 (%i2) l1:makelist(random(10),k,1,50)$
2620 (%i3) l2:makelist(random(10),k,1,100)$
2621 @group
2622 (%i4) barsplot(
2623    l1,l2,
2624    box_width = 1,
2625    fill_density = 1,
2626    bars_colors = [black, grey],
2627    frequency = relative,
2628    sample_keys = ["A", "B"])$
2629 @end group
2630 @end example
2631
2632 Four non-numeric samples of equal size.
2633
2634 @c ===beg===
2635 @c load ("descriptive")$
2636 @c barsplot(
2637 @c   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2638 @c   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2639 @c   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2640 @c   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2641 @c   title      = "Asking for something to four groups",
2642 @c   ylabel     = "# of individuals",
2643 @c   groups_gap = 3,
2644 @c   fill_density = 0.5,
2645 @c   ordering = ordergreatp)$
2646 @c ===end===
2647 @example
2648 (%i1) load ("descriptive")$
2649 @group
2650 (%i2) barsplot(
2651   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2652   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2653   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2654   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2655   title      = "Asking for something to four groups",
2656   ylabel     = "# of individuals",
2657   groups_gap = 3,
2658   fill_density = 0.5,
2659   ordering = ordergreatp)$
2660 @end group
2661 @end example
2662
2663 Stacked bars.
2664
2665 @c ===beg===
2666 @c load ("descriptive")$
2667 @c barsplot(
2668 @c   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2669 @c   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2670 @c   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2671 @c   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2672 @c   title      = "Asking for something to four groups",
2673 @c   ylabel     = "# of individuals",
2674 @c   grouping   = stacked,
2675 @c   fill_density = 0.5,
2676 @c   ordering = ordergreatp)$
2677 @c ===end===
2678 @example
2679 (%i1) load ("descriptive")$
2680 @group
2681 (%i2) barsplot(
2682   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2683   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2684   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2685   makelist([Yes, No, Maybe][random(3)+1],k,1,50),
2686   title      = "Asking for something to four groups",
2687   ylabel     = "# of individuals",
2688   grouping   = stacked,
2689   fill_density = 0.5,
2690   ordering = ordergreatp)$
2691 @end group
2692 @end example
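
The bar heights are the frequencies of each distinct outcome. As a rough
sketch, they can be tabulated with @code{discrete_freq} from this package.
The sample @code{x} is hypothetical, and dividing by the sample size is
assumed to correspond to the scales selected by @code{frequency = relative}
and @code{frequency = percent}.

@example
load ("descriptive")$
/* hypothetical discrete sample */
x : [1, 2, 2, 3, 3, 3]$
/* list of distinct outcomes and list of absolute counts */
f : discrete_freq (x);
/* relative and percent scales derived from the counts */
rel : second (f) / length (x);
pct : 100 * rel;
@end example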
2693
2694 For options related to bar diagrams, see @code{bars} of package @ref{draw-pkg}.
2695 See also functions @mref{histogram} and @mrefdot{piechart}
2696
2697 @opencatbox{Categories:}
2698 @category{Package descriptive}
2699 @category{Plotting}
2700 @closecatbox
2701 @end deffn
2702
2703 @anchor{barsplot_description}
2704 @deffn {Function} barsplot_description (@dots{})
2705
2706 Function @code{barsplot_description} creates a graphic object
2707 suitable for creating complex scenes, together with other
2708 graphic objects.
2709
2710 Example: @code{barsplot} in a multiplot context.
2711
2712 @example
2713 (%i1) load ("descriptive")$
2714 (%i2) l1:makelist(random(10),k,1,50)$
2715 (%i3) l2:makelist(random(10),k,1,100)$
2716 (%i4) bp1 :
2717         barsplot_description(
2718          l1,
2719          box_width = 1,
2720          fill_density = 0.5,
2721          bars_colors = [blue],
2722          frequency = relative)$
2723 (%i5) bp2 :
2724         barsplot_description(
2725          l2,
2726          box_width = 1,
2727          fill_density = 0.5,
2728          bars_colors = [red],
2729          frequency = relative)$
2730 (%i6) draw(gr2d(bp1), gr2d(bp2))$
2731 @end example
2732
2733 @opencatbox{Categories:}
2734 @category{Package descriptive}
2735 @category{Plotting}
2736 @closecatbox
2737 @end deffn
2738
2739 @anchor{boxplot}
2740 @deffn {Function} boxplot (@var{data}) @
2741 @fname{boxplot} (@var{data}, @var{option_1}, @var{option_2}, @dots{})
2742
2743 This function plots box-and-whisker diagrams. Argument @var{data} can be a list,
2744 although a single sample is of limited interest, since these diagrams are mainly
2745 used for comparing different samples; or a matrix, which makes it possible to compare
2746 two or more components of a multivariate statistical variable.
2747 @var{data} can also be a list of samples with possibly different
2748 sample sizes; in fact, this is the only function
2749 in package @code{descriptive} that admits this type of data structure.
2750
2751 The box is plotted from the first quartile to the third, with a horizontal
2752 segment situated at the second quartile or median. By default, lower and
2753 upper whiskers are plotted at the minimum and maximum values,
2754 respectively. Option @var{range} can be used to indicate that values greater
2755 than @code{quantile(x,3/4)+range*(quantile(x,3/4)-quantile(x,1/4))} or
2756 less than @code{quantile(x,1/4)-range*(quantile(x,3/4)-quantile(x,1/4))}
2757 must be considered as outliers, in which case they are plotted as
2758 isolated points, and the whiskers are located at the extremes of the rest of
2759 the sample.
2760
2761 Available options are:
2762
2763 @itemize @bullet
2764
2765 @item
2766 @var{box_width} (default, @code{3/4}): relative width of boxes.
2767 This  value must be in the range @code{[0,1]}.
2768
2769 @item
2770 @var{box_orientation} (default, @code{vertical}): possible values: @code{vertical}
2771 and @code{horizontal}.
2772
2773 @item
2774 @var{range} (default, @code{inf}): positive coefficient of the interquartile range
2775 used to set the outlier boundaries.
2776
2777 @item
2778 @var{outliers_size} (default, @code{1}): circle size for isolated outliers.
2779
2780 @item
2781 All @code{draw} options, except @code{points_joined}, @code{point_size}, @code{point_type},
2782 @code{xtics}, @code{ytics}, @code{xrange}, and @code{yrange}, which are
2783 internally assigned by @code{boxplot}.
2784 If you want to set your own values for these options or want to build
2785 complex scenes, make use of @code{boxplot_description}.
2786
2787 @item
2788 The following local @code{draw} options: @code{key}, @code{color},
2789 and @code{line_width}.
2790
2791 @end itemize
2792
2793 There is also a function @code{wxboxplot} for creating embedded
2794 box-and-whisker diagrams in interfaces wxMaxima and iMaxima.
2795
2796 Examples:
2797
2798 Box-and-whisker diagram from a multivariate sample.
2799
2800 @c ===beg===
2801 @c load ("descriptive")$
2802 @c s2 : read_matrix(file_search("wind.data"))$
2803 @c boxplot(s2,
2804 @c   box_width  = 0.2,
2805 @c   title      = "Windspeed in knots",
2806 @c   xlabel     = "Stations",
2807 @c   color      = red,
2808 @c   line_width = 2)$
2809 @c ===end===
2810 @example
2811 (%i1) load ("descriptive")$
2812 (%i2) s2 : read_matrix(file_search("wind.data"))$
2813 @group
2814 (%i3) boxplot(s2,
2815   box_width  = 0.2,
2816   title      = "Windspeed in knots",
2817   xlabel     = "Stations",
2818   color      = red,
2819   line_width = 2)$
2820 @end group
2821 @end example
2822
2823 Box-and-whisker diagram from three samples of different sizes.
2824
2825 @c ===beg===
2826 @c load ("descriptive")$
2827 @c A :
2828 @c  [[6, 4, 6, 2, 4, 8, 6, 4, 6, 4, 3, 2],
2829 @c   [8, 10, 7, 9, 12, 8, 10],
2830 @c   [16, 13, 17, 12, 11, 18, 13, 18, 14, 12]]$
2831 @c boxplot (A, box_orientation = horizontal)$
2832 @c ===end===
2833 @example
2834 (%i1) load ("descriptive")$
2835 @group
2836 (%i2) A :
2837  [[6, 4, 6, 2, 4, 8, 6, 4, 6, 4, 3, 2],
2838   [8, 10, 7, 9, 12, 8, 10],
2839   [16, 13, 17, 12, 11, 18, 13, 18, 14, 12]]$
2840 @end group
2841 (%i3) boxplot (A, box_orientation = horizontal)$
2842 @end example
2843
2844 Option @var{range} can be used to handle outliers.
2845
2846 @c ===beg===
2847 @c  load ("descriptive")$
2848 @c  B: [[7, 15, 5, 8, 6, 5, 7, 3, 1],
2849 @c      [10, 8, 12, 8, 11, 9, 20],
2850 @c      [23, 17, 19, 7, 22, 19]] $
2851 @c  boxplot (B, range=1)$
2852 @c  boxplot (B, range=1.5, box_orientation = horizontal)$
2853 @c  draw2d(
2854 @c     boxplot_description(
2855 @c        B,
2856 @c        range            = 1.5,
2857 @c        line_width       = 3,
2858 @c        outliers_size    = 2,
2859 @c        color            = red,
2860 @c        background_color = light_gray),
2861 @c     xtics = {["Low",1],["Medium",2],["High",3]}) $
2862 @c ===end===
2863 @example
2864 @group
2865 (%i1) load ("descriptive")$
2866 (%i2) B: [[7, 15, 5, 8, 6, 5, 7, 3, 1],
2867           [10, 8, 12, 8, 11, 9, 20],
2868           [23, 17, 19, 7, 22, 19]] $
2869 (%i3) boxplot (B, range=1)$
2870 (%i4) boxplot (B, range=1.5, box_orientation = horizontal)$
2871 (%i5) draw2d(
2872          boxplot_description(
2873             B,
2874             range            = 1.5,
2875             line_width       = 3,
2876             outliers_size    = 2,
2877             color            = red,
2878             background_color = light_gray),
2879          xtics = @{["Low",1],["Medium",2],["High",3]@}) $
2880 @end group
2881 @end example
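
The outlier rule described for option @var{range} can be reproduced by
hand with @code{quantile}. The sketch below applies it to the second
sample of @code{B} with a coefficient of 1.5; the names @code{x}, @code{r},
@code{q1}, @code{q3}, @code{lower}, @code{upper} and @code{out} are
illustrative.

@example
load ("descriptive")$
/* second sample of B in the example above */
x : [10, 8, 12, 8, 11, 9, 20]$
/* coefficient passed as option range */
r : 1.5$
q1 : quantile (x, 1/4)$
q3 : quantile (x, 3/4)$
/* boundaries beyond which values count as outliers */
lower : q1 - r*(q3 - q1)$
upper : q3 + r*(q3 - q1)$
/* values that would be plotted as isolated points */
out : sublist (x, lambda ([v], v < lower or v > upper));
@end example

The whiskers would then be placed at the extremes of the values not
listed in @code{out}.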
2882
2883 @opencatbox{Categories:}
2884 @category{Package descriptive}
2885 @category{Plotting}
2886 @closecatbox
2887 @end deffn
2888
2889 @anchor{boxplot_description}
2890 @deffn {Function} boxplot_description (@dots{})
2891
2892 Function @code{boxplot_description} creates a graphic object
2893 suitable for creating complex scenes, together with other
2894 graphic objects.
2895
2896 @opencatbox{Categories:}
2897 @category{Package descriptive}
2898 @category{Plotting}
2899 @closecatbox
2900 @end deffn
2901
2902 @anchor{histogram}
2903 @deffn {Function} histogram @
2904 @fname{histogram} (@var{list}) @
2905 @fname{histogram} (@var{list}, @var{option_1}, @var{option_2}, @dots{}) @
2906 @fname{histogram} (@var{one_column_matrix}) @
2907 @fname{histogram} (@var{one_column_matrix}, @var{option_1}, @var{option_2}, @dots{}) @
2908 @fname{histogram} (@var{one_row_matrix}) @
2909 @fname{histogram} (@var{one_row_matrix}, @var{option_1}, @var{option_2}, @dots{})
2910
2911 Constructs and displays a histogram from a data sample.
2912 Data must be stored as a list of numbers, or a matrix of one row or one column.
2913
2914 Optional arguments:
2915
2916 @itemize @bullet
2917
2918 @item
2919 @code{nclasses} (default, 10):
2920 the number of classes (also called bins) in the histogram,
2921 or a list of two numbers (the least and greatest values included in the histogram),
2922 or a list of three numbers (the least and greatest values included in the histogram, and the number of classes),
2923 or a set containing the endpoints of the class intervals,
2924 or a symbol specifying the name of one of three algorithms to automatically determine the number of classes:
2925 @code{fd} (Ref. [1]), @code{scott} (Ref. [2]), or @code{sturges} (Ref. [3]).
2926
2927 A class interval excludes its left endpoint and includes its right endpoint,
2928 except for the first interval, which includes both the left and right endpoints.
2929 It is assumed that class intervals are contiguous.
2930 That is, the right endpoint of one interval is equal to the left endpoint of the next.
2931
2932 @item
2933 @code{frequency} (default, @code{absolute}): indicates the scale of the vertical axis.
2934 Possible values are:  @code{absolute} (heights of bars add up to number of data),
2935 @code{relative} (heights of bars add up to 1),
2936 @code{percent} (heights of bars add up to 100),
2937 and @code{density} (total area of histogram is 1).
2938
2939 @item
2940 @code{htics} (default, @code{auto}): format of tic marks on the horizontal axis.
2941 Possible values are: @code{auto} (tics are placed automatically),
2942 @code{endpoints} (tics are placed at the divisions between classes),
2943 @code{intervals} (classes are labeled with the corresponding intervals),
2944 or a list of labels, one for each class.
2945
2946 @item
2947 All global @code{draw} options, except @code{xrange}, @code{yrange},
2948 and @code{xtics}, which are internally assigned by @code{histogram}.
2949 If you want to set your own values for these options, make use of
2950 @code{histogram_description}.
2951
2952 @item
2953 The following local @ref{draw-pkg} options: @mrefcomma{key}
2954 @mrefcomma{fill_color} @mrefcomma{fill_density} and @mrefdot{line_width}
2955 Note that the outlines of bars,
2956 as well as the interior of bars when @code{fill_density} is nonzero,
2957 are drawn with @code{fill_color}, not @code{color}.
2958
2959 @end itemize
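
As a brief sketch of how the @code{frequency} and @code{htics} options
described above combine, the following call plots the digits sample on a
percentage scale and labels each class with its interval.
The choice of five classes is arbitrary and made only for illustration.

@example
(%i1) load ("descriptive")$
(%i2) s1 : read_list (file_search ("pidigits.data"))$
@group
(%i3) histogram (
     s1,
     nclasses  = 5,
     frequency = percent,
     htics     = intervals)$
@end group
@end example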
2960
2961 @code{histogram} honors the global option @code{histogram_skyline}.
2962 When @code{histogram_skyline} is @code{true},
2963 @code{histogram} and @code{histogram_description} construct "skyline" plots,
which show the outline of the histogram bars,
2965 instead of drawing all the vertical segments.
2966 Otherwise (the default), histograms are displayed with bars showing vertical segments.
2967
There is also a function @code{wxhistogram} for creating embedded
histograms in the wxMaxima and iMaxima interfaces.
2970
2971 See also @mrefcomma{continuous_freq}
2972 which, like @code{histogram},
2973 counts data in intervals,
2974 but returns the counts instead of displaying a graphic representation.
2975
2976 See also @mrefdot{barsplot}
2977
2978 Examples:
2979
2980 A simple histogram with eight classes:
2981
2982 @c ===beg===
2983 @c load ("descriptive")$
2984 @c s1 : read_list (file_search ("pidigits.data"))$
2985 @c histogram (
2986 @c      s1,
2987 @c      nclasses     = 8,
2988 @c      title        = "pi digits",
2989 @c      xlabel       = "digits",
2990 @c      ylabel       = "Absolute frequency",
2991 @c      fill_color   = grey,
2992 @c      fill_density = 0.6)$
2993 @c ===end===
2994 @example
2995 (%i1) load ("descriptive")$
2996 (%i2) s1 : read_list (file_search ("pidigits.data"))$
2997 @group
2998 (%i3) histogram (
2999      s1,
3000      nclasses     = 8,
3001      title        = "pi digits",
3002      xlabel       = "digits",
3003      ylabel       = "Absolute frequency",
3004      fill_color   = grey,
3005      fill_density = 0.6)$
3006 @end group
3007 @end example
3008
Set the limits of the histogram to -2 and 12, with 3 classes,
and label the classes with user-supplied strings:
3011
3012 @c ===beg===
3013 @c load ("descriptive")$
3014 @c s1 : read_list (file_search ("pidigits.data"))$
3015 @c histogram (
3016 @c      s1,
3017 @c      nclasses     = [-2,12,3],
3018 @c      htics        = ["A", "B", "C"],
3019 @c      terminal     = png,
3020 @c      fill_color   = "#23afa0",
3021 @c      fill_density = 0.6)$
3022 @c ===end===
3023 @example
3024 (%i1) load ("descriptive")$
3025 (%i2) s1 : read_list (file_search ("pidigits.data"))$
3026 @group
3027 (%i3) histogram (
3028      s1,
3029      nclasses     = [-2,12,3],
3030      htics        = ["A", "B", "C"],
3031      terminal     = png,
3032      fill_color   = "#23afa0",
3033      fill_density = 0.6)$
3034 @end group
3035 @end example
3036
Specify the class endpoints explicitly to obtain classes of varying width.
3038
3039 @c ===beg===
3040 @c load ("descriptive")$
3041 @c s1 : read_list (file_search ("pidigits.data"))$
3042 @c histogram (s1, nclasses = {0,3,6,7,11})$
3043 @c ===end===
3044 @example
3045 (%i1) load ("descriptive")$
3046 (%i2) s1 : read_list (file_search ("pidigits.data"))$
3047 (%i3) histogram (s1, nclasses = @{0,3,6,7,11@})$
3048 @end example
3049
Use the Freedman-Diaconis rule to determine the number of classes.
3051
3052 @c ===beg===
3053 @c load ("descriptive")$
3054 @c s1 : read_list (file_search ("pidigits.data"))$
3055 @c histogram(s1, nclasses=fd) $
3056 @c ===end===
3057 @example
3058 (%i1) load ("descriptive")$
3059 (%i2) s1 : read_list (file_search ("pidigits.data"))$
3060 (%i3) histogram(s1, nclasses=fd) $
3061 @end example
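
To obtain the class counts instead of a plot, the same sample can be passed to
@code{continuous_freq}.
This is only a sketch: it assumes that @code{continuous_freq} accepts the number
of classes as its second argument, mirroring @code{nclasses = 8} in the first example.
The variable name @code{counts} is chosen here for illustration;
see the entry for @code{continuous_freq} for the exact form of the returned value.

@example
(%i1) load ("descriptive")$
(%i2) s1 : read_list (file_search ("pidigits.data"))$
(%i3) counts : continuous_freq (s1, 8)$
@end example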
3062
3063 References:
3064
3065 [1] Freedman, D., and Diaconis, P. (1981) On the histogram as a density estimator: L_2 theory.
3066 Zeitschrift f@"ur Wahrscheinlichkeitstheorie und verwandte Gebiete 57, 453-476.
3067
3068 [2] Scott, D. W. (1979) On optimal and data-based histograms. Biometrika 66, 605-610.
3069
3070 [3] Sturges, H. A. (1926) The choice of a class interval. Journal of the American Statistical Association 21, 65-66.
3071
3072 @opencatbox{Categories:}
3073 @category{Package descriptive}
3074 @category{Plotting}
3075 @closecatbox
3076 @end deffn
3077
3078 @anchor{histogram_description}
3079 @deffn {Function} histogram_description (@dots{})
3080
3081 Creates a graphic object which represents a histogram.
3082 Such an object is suitable for creating complex scenes together with other graphic objects,
3083 to be displayed by @code{draw2d}.
3084
3085 @code{histogram_description} takes the same arguments
3086 as the stand-alone function @code{histogram}.
3087 See @mref{histogram} for more information.
3088
3089 Example:
3090
3091 We make use of @code{histogram_description} for setting
3092 @code{xrange} and adding an explicit curve into the scene:
3093
3094 @example
3095 (%i1) load ("descriptive")$
3096 (%i2) ( load("distrib"),
3097         m: 14, s: 2,
3098         s2: random_normal(m, s, 1000) ) $
3099 (%i3) draw2d(
3100         grid   = true,
3101         xrange = [5, 25],
3102         histogram_description(
3103           s2,
3104           nclasses     = 9,
3105           frequency    = density,
3106           fill_density = 0.5),
3107         explicit(pdf_normal(x,m,s), x, m - 3*s, m + 3* s))$
3108 @end example
3109
3110 @opencatbox{Categories:}
3111 @category{Package descriptive}
3112 @category{Plotting}
3113 @closecatbox
3114 @end deffn
3115
3116 @anchor{histogram_skyline}
3117 @defvr {Option variable} histogram_skyline
3118 Default value: @code{false}
3119
3120 When @code{histogram_skyline} is @code{true},
3121 @code{histogram} and @code{histogram_description} construct "skyline" plots,
which show the outline of the histogram bars,
3123 instead of drawing all the vertical segments.
3124
3125 The outline is drawn with the current @code{fill_color} (not the current @code{color}).
3126 The interior of the histogram is filled with @code{fill_color},
3127 but only if @code{fill_density} is nonzero.
3128
3129 Otherwise, histograms are displayed with bars showing vertical segments.
3130
3131 Example:
3132
Construct a skyline histogram,
and an ordinary histogram for comparison,
as two scenes in the same window.
3136
3137 @example
3138 (%i1) load ("descriptive") $
3139 (%i2) L: read_list (file_search ("pidigits.data")) $
3140 (%i3) histogram_skyline: true $
3141 (%i4) skyline_hist: histogram_description (L) $
3142 (%i5) histogram_skyline: false $
3143 (%i6) ordinary_hist: histogram_description (L) $
3144 (%i7) draw (gr2d (skyline_hist), gr2d (ordinary_hist)) $
3145 @end example
3146
3147 Continuing the preceding example.
3148 Set display options for @code{fill_color} and @code{fill_density}.
3149
3150 @example
3151 (%i8) histogram_skyline: true $
3152 (%i9) skyline_hist: histogram_description (L, fill_color = blue, fill_density = 0.2) $
3153 (%i10) histogram_skyline: false $
3154 (%i11) ordinary_hist: histogram_description (L, fill_color = blue, fill_density = 0.2) $
3155 (%i12) draw (gr2d (skyline_hist), gr2d (ordinary_hist)) $
3156 @end example
3157
3158 @opencatbox{Categories:}
3159 @category{Package descriptive}
3160 @category{Plotting}
3161 @closecatbox
3162 @end defvr
3163
3164 @anchor{piechart}
3165 @deffn {Function} piechart @
3166 @fname{piechart} (@var{list}) @
3167 @fname{piechart} (@var{list}, @var{option_1}, @var{option_2}, @dots{}) @
3168 @fname{piechart} (@var{one_column_matrix}) @
3169 @fname{piechart} (@var{one_column_matrix}, @var{option_1}, @var{option_2}, @dots{}) @
3170 @fname{piechart} (@var{one_row_matrix}) @
3171 @fname{piechart} (@var{one_row_matrix}, @var{option_1}, @var{option_2}, @dots{})
3172
3173 Similar to @code{barsplot}, but plots sectors instead of rectangles.
3174
3175 Available options are:
3176
3177 @itemize @bullet
3178
3179 @item
@code{sector_colors} (default, @code{[]}): a list of colors for sectors.
When there are more sectors than specified colors, the additional colors needed
are chosen at random. See @code{color} to learn more about them.

@item
@code{pie_center} (default, @code{[0,0]}): diagram's center.

@item
@code{pie_radius} (default, @code{1}): diagram's radius.
3189
3190 @item
3191 All global @code{draw} options, except @code{key}, which is
3192 internally assigned by @code{piechart}.
3193 If you want to set your own values for this option or want to build
3194 complex scenes, make use of @code{piechart_description}.
3195
3196 @item
The following local @code{draw} options: @code{key}, @code{color},
@code{fill_density} and @code{line_width}. See also
@code{ellipse}.
3200
3201 @end itemize
3202
There is also a function @code{wxpiechart} for
creating embedded pie charts in the wxMaxima and iMaxima interfaces.
3205
3206 Example:
3207
3208 @c ===beg===
3209 @c load ("descriptive")$
3210 @c s1 : read_list (file_search ("pidigits.data"))$
3211 @c piechart(
3212 @c   s1,
3213 @c   xrange = [-1.1, 1.3],
3214 @c   yrange = [-1.1, 1.1],
3215 @c   title  = "Digit frequencies in pi")$
3216 @c ===end===
3217 @example
3218 (%i1) load ("descriptive")$
3219 (%i2) s1 : read_list (file_search ("pidigits.data"))$
3220 @group
3221 (%i3) piechart(
3222   s1,
3223   xrange = [-1.1, 1.3],
3224   yrange = [-1.1, 1.1],
3225   title  = "Digit frequencies in pi")$
3226 @end group
3227 @end example
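
The @code{sector_colors} option described above can be sketched as follows.
Only three colors are given for the ten digit outcomes, so the remaining
sectors should receive randomly chosen colors; the particular colors listed
here are an arbitrary choice for illustration.

@example
(%i1) load ("descriptive")$
(%i2) s1 : read_list (file_search ("pidigits.data"))$
@group
(%i3) piechart(
  s1,
  sector_colors = [red, blue, green],
  xrange = [-1.1, 1.3],
  yrange = [-1.1, 1.1],
  title  = "Digit frequencies in pi")$
@end group
@end example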
3228
3229 See also function @mrefdot{barsplot}
3230
3231 @opencatbox{Categories:}
3232 @category{Package descriptive}
3233 @category{Plotting}
3234 @closecatbox
3235 @end deffn
3236
3237 @anchor{piechart_description}
3238 @deffn {Function} piechart_description (@dots{})
3239
3240 Function @code{piechart_description} creates a graphic object
3241 suitable for creating complex scenes, together with other
3242 graphic objects.
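
Example:

A minimal sketch, assuming that @code{piechart_description} takes the same
arguments as @code{piechart} and that its result can be passed to @code{draw2d}
in the same way as the result of @code{histogram_description}.
The @code{label} object and its position are added here only to illustrate
combining the chart with another graphic object.

@example
(%i1) load ("descriptive")$
(%i2) s1 : read_list (file_search ("pidigits.data"))$
@group
(%i3) draw2d(
        xrange = [-1.1, 1.3],
        yrange = [-1.1, 1.4],
        piechart_description(s1),
        label(["Digit frequencies in pi", 0, 1.25]))$
@end group
@end example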
3243
3244 @opencatbox{Categories:}
3245 @category{Package descriptive}
3246 @category{Plotting}
3247 @closecatbox
3248 @end deffn
3249
3250 @anchor{scatterplot}
3251 @deffn {Function} scatterplot @
3252 @fname{scatterplot} (@var{list}) @
3253 @fname{scatterplot} (@var{list}, @var{option_1}, @var{option_2}, @dots{}) @
3254 @fname{scatterplot} (@var{matrix}) @
3255 @fname{scatterplot} (@var{matrix}, @var{option_1}, @var{option_2}, @dots{})
3256
3257 Plots scatter diagrams both for univariate (@var{list}) and multivariate
3258 (@var{matrix}) samples.
3259
The available options are the same as those accepted by @code{histogram}.
3261
There is also a function @code{wxscatterplot} for
creating embedded scatter plots in the wxMaxima and iMaxima interfaces.
3264
3265 Examples:
3266
3267 Univariate scatter diagram from a simulated Gaussian sample.
3268
3269 @c ===beg===
3270 @c load ("descriptive")$
3271 @c load ("distrib")$
3272 @c scatterplot(
3273 @c   random_normal(0,1,200),
3274 @c   xaxis      = true,
3275 @c   point_size = 2,
3276 @c   dimensions = [600,150])$
3277 @c ===end===
3278 @example
3279 (%i1) load ("descriptive")$
3280 (%i2) load ("distrib")$
3281 @group
3282 (%i3) scatterplot(
3283   random_normal(0,1,200),
3284   xaxis      = true,
3285   point_size = 2,
3286   dimensions = [600,150])$
3287 @end group
3288 @end example
3289
Two-dimensional scatter plot.
3291
3292 @c ===beg===
3293 @c load ("descriptive")$
3294 @c s2 : read_matrix (file_search ("wind.data"))$
3295 @c scatterplot(
3296 @c  submatrix(s2, 1,2,3),
3297 @c  title      = "Data from stations #4 and #5",
3298 @c  point_type = diamant,
3299 @c  point_size = 2,
3300 @c  color      = blue)$
3301 @c ===end===
3302 @example
3303 (%i1) load ("descriptive")$
3304 (%i2) s2 : read_matrix (file_search ("wind.data"))$
3305 @group
3306 (%i3) scatterplot(
3307  submatrix(s2, 1,2,3),
3308  title      = "Data from stations #4 and #5",
3309  point_type = diamant,
3310  point_size = 2,
3311  color      = blue)$
3312 @end group
3313 @end example
3314
Three-dimensional scatter plot.
3316
3317 @c ===beg===
3318 @c load ("descriptive")$
3319 @c s2 : read_matrix (file_search ("wind.data"))$
3320 @c scatterplot(submatrix (s2, 1,2), nclasses=4)$
3321 @c ===end===
3322 @example
3323 (%i1) load ("descriptive")$
3324 (%i2) s2 : read_matrix (file_search ("wind.data"))$
3325 (%i3) scatterplot(submatrix (s2, 1,2), nclasses=4)$
3326 @end example
3327
Five-dimensional scatter plot, with five-class histograms.
3329
3330 @c ===beg===
3331 @c load ("descriptive")$
3332 @c s2 : read_matrix (file_search ("wind.data"))$
3333 @c scatterplot(
3334 @c   s2,
3335 @c   nclasses     = 5,
3336 @c   frequency    = relative,
3337 @c   fill_color   = blue,
3338 @c   fill_density = 0.3,
3339 @c   xtics        = 5)$
3340 @c ===end===
3341 @example
3342 (%i1) load ("descriptive")$
3343 (%i2) s2 : read_matrix (file_search ("wind.data"))$
3344 @group
3345 (%i3) scatterplot(
3346   s2,
3347   nclasses     = 5,
3348   frequency    = relative,
3349   fill_color   = blue,
3350   fill_density = 0.3,
3351   xtics        = 5)$
3352 @end group
3353 @end example
3354
3355 For plotting isolated or line-joined points in two and three dimensions,
3356 see @code{points}. See also @mrefdot{histogram}
3357
3358 @opencatbox{Categories:}
3359 @category{Package descriptive}
3360 @category{Plotting}
3361 @closecatbox
3362 @end deffn
3363
3364 @anchor{scatterplot_description}
3365 @deffn {Function} scatterplot_description (@dots{})
3366
3367 Function @code{scatterplot_description} creates a graphic object
3368 suitable for creating complex scenes, together with other
3369 graphic objects.
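
Example:

A minimal sketch, assuming that @code{scatterplot_description} takes the same
arguments as @code{scatterplot} and that, for a two-column sample, its result
can be passed to @code{draw2d} alongside other graphic objects.
The reference line drawn with @code{explicit} and its range are arbitrary
choices for illustration.

@example
(%i1) load ("descriptive")$
(%i2) s2 : read_matrix (file_search ("wind.data"))$
@group
(%i3) draw2d(
        grid = true,
        scatterplot_description(
          submatrix(s2, 1,2,3),
          point_size = 1,
          color      = blue),
        color = red,
        explicit(x, x, 0, 25))$
@end group
@end example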
3370
3371 @opencatbox{Categories:}
3372 @category{Package descriptive}
3373 @category{Plotting}
3374 @closecatbox
3375 @end deffn
3376
3377 @anchor{starplot}
3378 @deffn {Function} starplot (@var{data1}, @var{data2}, @dots{}, @var{option_1}, @var{option_2}, @dots{})
3379
Plots star diagrams for discrete statistical variables,
for either one or multiple samples.

Each of @var{data1}, @var{data2}, @dots{} can be a list of outcomes representing one sample, or a
matrix of @var{m} rows and @var{n} columns, representing @var{n} samples of size
@var{m} each.
3386
3387 Available options are:
3388
3389 @itemize @bullet
3390
3391 @item
@code{stars_colors} (default, @code{[]}): a list of colors for multiple samples.
When there are more samples than specified colors, the additional colors needed
are chosen at random. See @code{color} to learn more about them.

@item
@code{frequency} (default, @code{absolute}): indicates the scale of the
radii. Possible values are @code{absolute} and @code{relative}.

@item
@code{ordering} (default, @code{orderlessp}): possible values are @code{orderlessp} or @code{ordergreatp},
indicating how statistical outcomes should be ordered.

@item
@code{sample_keys} (default, @code{[]}): a list with the strings to be used in the legend.
If the length of the list is neither 0 nor the number of samples, an error is reported.

@item
@code{star_center} (default, @code{[0,0]}): diagram's center.

@item
@code{star_radius} (default, @code{1}): diagram's radius.
3414
3415 @item
3416 All global @code{draw} options, except @code{points_joined}, @code{point_type},
3417 and @code{key}, which are internally assigned by @code{starplot}.
If you want to set your own values for these options or want to build
3419 complex scenes, make use of @code{starplot_description}.
3420
3421 @item
3422 The following local @code{draw} option: @code{line_width}.
3423
3424 @end itemize
3425
There is also a function @code{wxstarplot} for
creating embedded star plots in the wxMaxima and iMaxima interfaces.
3428
3429 Example:
3430
3431 Plot based on absolute frequencies.
3432 Location and radius defined by the user.
3433
3434 @example
3435 (%i1) load ("descriptive")$
3436 (%i2) l1: makelist(random(10),k,1,50)$
3437 (%i3) l2: makelist(random(10),k,1,200)$
3438 @group
3439 (%i4) starplot(
3440         l1, l2,
3441         stars_colors = [blue,red],
3442         sample_keys = ["1st sample", "2nd sample"],
3443         star_center = [1,2],
3444         star_radius = 4,
3445         proportional_axes = xy,
3446         line_width = 2 ) $
3447 @end group
3448 @end example
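
The @code{frequency} and @code{ordering} options listed above can be sketched
with a single sample.
The sample is randomly generated, so the resulting plot differs from run to run.

@example
(%i1) load ("descriptive")$
(%i2) l1: makelist(random(10),k,1,50)$
@group
(%i3) starplot(
        l1,
        frequency = relative,
        ordering  = ordergreatp ) $
@end group
@end example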
3449
3450 @opencatbox{Categories:}
3451 @category{Package descriptive}
3452 @category{Plotting}
3453 @closecatbox
3454 @end deffn
3455
3456 @anchor{starplot_description}
3457 @deffn {Function} starplot_description (@dots{})
3458
3459 Function @code{starplot_description} creates a graphic object
3460 suitable for creating complex scenes, together with other
3461 graphic objects.
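
Example:

A minimal sketch, assuming that @code{starplot_description} takes the same
arguments as @code{starplot} and that its result can be passed to @code{draw2d}
in the same way as the result of @code{histogram_description}.
The title string is an arbitrary choice for illustration.

@example
(%i1) load ("descriptive")$
(%i2) l1: makelist(random(10),k,1,50)$
(%i3) l2: makelist(random(10),k,1,200)$
@group
(%i4) draw2d(
        title             = "Two random samples",
        proportional_axes = xy,
        starplot_description(
          l1, l2,
          stars_colors = [blue,red],
          sample_keys  = ["1st sample", "2nd sample"]))$
@end group
@end example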
3462
3463 @opencatbox{Categories:}
3464 @category{Package descriptive}
3465 @category{Plotting}
3466 @closecatbox
3467 @end deffn
3468
3469 @anchor{stemplot}
3470 @deffn {Function} stemplot @
3471 @fname{stemplot} (@var{data}) @
3472 @fname{stemplot} (@var{data}, @var{option})
3473
3474 Plots stem and leaf diagrams.
3475
The only available option is:
3477
3478 @itemize @bullet
3479
3480 @item
@code{leaf_unit} (default, @code{1}): indicates the unit of the leaves; must be a
power of 10.
3483
3484 @end itemize
3485
3486 Example:
3487
3488 @example
3489 (%i1) load ("descriptive")$
3490 (%i2) load("distrib")$
3491 @group
3492 (%i3) stemplot(
3493         random_normal(15, 6, 100),
3494         leaf_unit = 0.1);
3495 -5|4
3496  0|37
3497  1|7
3498  3|6
3499  4|4
3500  5|4
3501  6|57
3502  7|0149
3503  8|3
3504  9|1334588
3505 10|07888
3506 11|01144467789
3507 12|12566889
3508 13|24778
3509 14|047
3510 15|223458
3511 16|4
3512 17|11557
3513 18|000247
3514 19|4467799
3515 20|00
3516 21|1
3517 22|2335
3518 23|01457
3519 24|12356
3520 25|455
3521 27|79
3522 key: 6|3 =  6.3
3523 (%o3)                  done
3524 @end group
3525 @end example
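
For comparison, a sketch with a small integer sample and @code{leaf_unit = 1}.
By analogy with the key line in the example above, each leaf should then stand
for one unit rather than one tenth.
The data values are hypothetical and chosen only for illustration;
the printed diagram is not reproduced here.

@example
(%i1) load ("descriptive")$
@group
(%i2) stemplot(
        [12, 15, 23, 27, 27, 31, 38, 44],
        leaf_unit = 1)$
@end group
@end example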
3526
3527 @opencatbox{Categories:}
3528 @category{Package descriptive}
3529 @category{Plotting}
3530 @closecatbox
3531 @end deffn
3532