# -*- coding: utf-8 -*-
# Copyright 2012 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Additional help about object metadata."""

from __future__ import absolute_import

from gslib.help_provider import HelpProvider

_DETAILED_HELP_TEXT = ("""
<B>OVERVIEW OF METADATA</B>
  Objects can have associated metadata, which controls aspects of how
  GET requests are handled, including Content-Type, Cache-Control,
  Content-Disposition, and Content-Encoding (discussed in more detail in
  the subsections below). In addition, you can set custom metadata that
  can be used by applications (e.g., tagging that particular objects possess
  some property).

  There are two ways to set metadata on objects:

  - At upload time you can specify one or more headers to associate with
    objects, using the gsutil -h option. For example, the following command
    would cause gsutil to set the Content-Type and Cache-Control for each
    of the files being uploaded:

      gsutil -h "Content-Type:text/html" \\
             -h "Cache-Control:public, max-age=3600" cp -r images \\
             gs://bucket/images

    Note that -h is an option on the gsutil command, not the cp sub-command.

  - You can set or remove metadata fields from already uploaded objects using
    the gsutil setmeta command. See "gsutil help setmeta".

  More details about specific pieces of metadata are discussed below.


<B>CONTENT TYPE</B>
  The most commonly set metadata is Content-Type (also known as MIME type),
  which allows browsers to render the object properly.
  gsutil sets the Content-Type automatically at upload time, based on each
  filename extension. For example, uploading files with names ending in .txt
  will set Content-Type to text/plain. If you're running gsutil on Linux or
  macOS and would prefer to have content type set based on naming plus content
  examination, see the use_magicfile configuration variable in the gsutil/boto
  configuration file (see also "gsutil help config"). In general, using
  use_magicfile is more robust and configurable, but is not available on
  Windows.

  If you specify a Content-Type header with -h when uploading content (like the
  example gsutil command given in the previous section), it overrides the
  Content-Type that would have been set based on filename extension or content.
  This can be useful if the Content-Type detection algorithm doesn't work as
  desired for some of your files.

  You can also completely suppress content type detection in gsutil, by
  specifying an empty string on the Content-Type header:

    gsutil -h 'Content-Type:' cp -r images gs://bucket/images

  In this case, the Google Cloud Storage service will not attempt to detect
  the content type. In general this approach will work better than using
  filename extension-based content detection in gsutil, because the list of
  filename extensions is kept more current in the server-side content detection
  system than in the Python library upon which gsutil content type detection
  depends. (For example, at the time of writing this, the filename extension
  ".webp" was recognized by the server-side content detection system, but
  not by gsutil.)


<B>CACHE-CONTROL</B>
  Another commonly set piece of metadata is Cache-Control, which allows
  you to control whether and for how long browser and Internet caches are
  allowed to cache your objects. Cache-Control only applies to objects with
  a public-read ACL. Non-public data are not cacheable.

  Here's an example of uploading an object set to allow caching:

    gsutil -h "Cache-Control:public,max-age=3600" cp -a public-read \\
           -r html gs://bucket/html

  This command would upload all files in the html directory (and subdirectories)
  and make them publicly readable and cacheable, with cache expiration of
  one hour.

  Note that if you allow caching, at download time you may see older versions
  of objects after uploading a newer replacement object. Note also that because
  objects can be cached at various places on the Internet there is no way to
  force a cached object to expire globally (unlike the way you can force your
  browser to refresh its cache).

  Another use of the Cache-Control header is through the "no-transform" value,
  which instructs Google Cloud Storage to not apply any content transformations
  based on specifics of a download request, such as removing gzip
  content-encoding for incompatible clients. Note that this parameter is only
  respected by the XML API. The Google Cloud Storage JSON API respects only the
  no-cache and max-age Cache-Control parameters.

  Note that if you upload an object with a public-read ACL and don't include a
  Cache-Control header, it will be served with a Cache-Control header allowing
  the object to be cached for 3600 seconds. This will not happen if the object
  is uploaded with a non-public ACL and then changed to public. Moreover, if you
  upload an object with a public-read ACL and later change the ACL not to be
  public-read, the object will no longer be served with the default
  Cache-Control header noted above (so will be served as not cacheable).

  For details about how to set the Cache-Control header see
  "gsutil help setmeta".


<B>CONTENT-ENCODING</B>
  You can specify a Content-Encoding to indicate that an object is compressed
  (for example, with gzip compression) while maintaining its Content-Type.
  You will need to ensure that the files have been compressed using the
  specified Content-Encoding before using gsutil to upload them. Consider the
  following example for Linux:

    echo "Highly compressible text" | gzip > foo.txt
    gsutil -h "Content-Encoding:gzip" -h "Content-Type:text/plain" \\
      cp foo.txt gs://bucket/compressed

  Note that this is different from uploading a gzipped object foo.txt.gz with
  Content-Type: application/x-gzip because most browsers are able to
  dynamically decompress and process objects served with Content-Encoding: gzip
  based on the underlying Content-Type.

  For compressible content, using Content-Encoding: gzip saves network and
  storage costs, and improves content serving performance. However, for content
  that is already inherently compressed (archives and many media formats, for
  instance) applying another level of compression via Content-Encoding is
  typically detrimental to both object size and performance and should be
  avoided.

  Note also that gsutil provides an easy way to cause content to be compressed
  and stored with Content-Encoding: gzip: see the -z option in "gsutil help cp".


<B>CONTENT-DISPOSITION</B>
  You can set Content-Disposition on your objects, to specify presentation
  information about the data being transmitted. Here's an example:

    gsutil -h 'Content-Disposition:attachment; filename=filename.ext' \\
      cp -r attachments gs://bucket/attachments

  Setting the Content-Disposition allows you to control presentation style
  of the content, for example determining whether an attachment should be
  automatically displayed or should require some form of action from the user
  to open it. See http://www.w3.org/Protocols/rfc2616/rfc2616-sec19.html#sec19.5.1
  for more details about the meaning of Content-Disposition.


<B>CUSTOM METADATA</B>
  You can add your own custom metadata (e.g., for use by your application)
  to an object by setting a header that starts with "x-goog-meta-", for example:

    gsutil -h x-goog-meta-reviewer:jane cp mycode.java gs://bucket/reviews

  You can add multiple differently named custom metadata fields to each object.


<B>SETTABLE FIELDS; FIELD VALUES</B>
  You can't set some metadata fields, such as ETag and Content-Length. The
  fields you can set are:

  - Cache-Control
  - Content-Disposition
  - Content-Encoding
  - Content-Language
  - Content-MD5
  - Content-Type
  - Any field starting with a matching Cloud Storage Provider
    prefix, such as x-goog-meta- (i.e., custom metadata).

  Header names are case-insensitive.

  x-goog-meta- fields can have data set to arbitrary Unicode values. All
  other fields must have ASCII values.


<B>VIEWING CURRENTLY SET METADATA</B>
  You can see what metadata is currently set on an object by using:

    gsutil ls -L gs://the_bucket/the_object
""")
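

# Illustration only, not part of gsutil: a minimal sketch of three steps the
# help text above describes. It assumes that the "Python library" used for
# extension-based detection is the standard mimetypes module, and that
# gzip.compress is available (Python 3.2 or later). The helper names
# guess_content_type, gzip_bytes, and header_options are hypothetical.

```python
import gzip
import mimetypes


def guess_content_type(filename):
  """Guesses a Content-Type from the filename extension alone.

  Returns None when the extension is not recognized, which is the case where
  an explicit -h "Content-Type:..." override is most useful.
  """
  content_type, _ = mimetypes.guess_type(filename)
  return content_type


def gzip_bytes(data):
  """Compresses bytes so they can be stored with Content-Encoding: gzip."""
  return gzip.compress(data)


def header_options(headers):
  """Builds top-level gsutil -h arguments from (name, value) pairs."""
  args = []
  for name, value in headers:
    args.extend(['-h', '%s:%s' % (name, value)])
  return args
```

# Possible usage: header_options([('Content-Encoding', 'gzip')]) yields the
# arguments that go before the cp sub-command, since -h is an option on
# gsutil itself.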


class CommandOptions(HelpProvider):
  """Additional help about object metadata."""

  # Help specification. See help_provider.py for documentation.
  help_spec = HelpProvider.HelpSpec(
      help_name='metadata',
      help_name_aliases=[
          'cache-control', 'caching', 'content type', 'mime type', 'mime',
          'type'],
      help_type='additional_help',
      help_one_line_summary='Working With Object Metadata',
      help_text=_DETAILED_HELP_TEXT,
      subcommand_help_text={},
  )