1 // Copyright 2014 The Chromium Authors. All rights reserved.
2 // Use of this source code is governed by a BSD-style license that can be
3 // found in the LICENSE file.
5 //------------------------------------------------------------------------------
6 // Description of the life cycle of a instance of MetricsService.
10 // A MetricsService instance is typically created at application startup. It is
11 // the central controller for the acquisition of log data, and the automatic
12 // transmission of that log data to an external server. Its major job is to
13 // manage logs, grouping them for transmission, and transmitting them. As part
14 // of its grouping, MS finalizes logs by including some just-in-time gathered
15 // memory statistics, snapshotting the current stats of numerous histograms,
16 // closing the logs, translating to protocol buffer format, and compressing the
17 // results for transmission. Transmission includes submitting a compressed log
18 // as data in a URL-post, and retransmitting (or retaining at process
19 // termination) if the attempted transmission failed. Retention across process
20 // terminations is done using the the PrefServices facilities. The retained logs
21 // (the ones that never got transmitted) are compressed and base64-encoded
22 // before being persisted.
24 // Logs fall into one of two categories: "initial logs," and "ongoing logs."
25 // There is at most one initial log sent for each complete run of Chrome (from
26 // startup, to browser shutdown). An initial log is generally transmitted some
27 // short time (1 minute?) after startup, and includes stats such as recent crash
28 // info, the number and types of plugins, etc. The external server's response
29 // to the initial log conceptually tells this MS if it should continue
30 // transmitting logs (during this session). The server response can actually be
31 // much more detailed, and always includes (at a minimum) how often additional
32 // ongoing logs should be sent.
34 // After the above initial log, a series of ongoing logs will be transmitted.
35 // The first ongoing log actually begins to accumulate information stating when
36 // the MS was first constructed. Note that even though the initial log is
37 // commonly sent a full minute after startup, the initial log does not include
38 // much in the way of user stats. The most common interlog period (delay)
39 // is 30 minutes. That time period starts when the first user action causes a
40 // logging event. This means that if there is no user action, there may be long
41 // periods without any (ongoing) log transmissions. Ongoing logs typically
42 // contain very detailed records of user activities (ex: opened tab, closed
43 // tab, fetched URL, maximized window, etc.) In addition, just before an
44 // ongoing log is closed out, a call is made to gather memory statistics. Those
45 // memory statistics are deposited into a histogram, and the log finalization
46 // code is then called. In the finalization, a call to a Histogram server
47 // acquires a list of all local histograms that have been flagged for upload
48 // to the UMA server. The finalization also acquires the most recent number
49 // of page loads, along with any counts of renderer or plugin crashes.
51 // When the browser shuts down, there will typically be a fragment of an ongoing
52 // log that has not yet been transmitted. At shutdown time, that fragment is
53 // closed (including snapshotting histograms), and persisted, for potential
54 // transmission during a future run of the product.
56 // There are two slightly abnormal shutdown conditions. There is a
57 // "disconnected scenario," and a "really fast startup and shutdown" scenario.
58 // In the "never connected" situation, the user has (during the running of the
59 // process) never established an internet connection. As a result, attempts to
60 // transmit the initial log have failed, and a lot(?) of data has accumulated in
61 // the ongoing log (which didn't yet get closed, because there was never even a
62 // contemplation of sending it). There is also a kindred "lost connection"
63 // situation, where a loss of connection prevented an ongoing log from being
64 // transmitted, and a (still open) log was stuck accumulating a lot(?) of data,
65 // while the earlier log retried its transmission. In both of these
66 // disconnected situations, two logs need to be, and are, persistently stored
67 // for future transmission.
69 // The other unusual shutdown condition, termed "really fast startup and
70 // shutdown," involves the deliberate user termination of the process before
71 // the initial log is even formed or transmitted. In that situation, no logging
72 // is done, but the historical crash statistics remain (unlogged) for inclusion
73 // in a future run's initial log. (i.e., we don't lose crash stats).
75 // With the above overview, we can now describe the state machine's various
76 // states, based on the State enum specified in the state_ member. Those states
79 // INITIALIZED, // Constructor was called.
80 // INIT_TASK_SCHEDULED, // Waiting for deferred init tasks to finish.
81 // INIT_TASK_DONE, // Waiting for timer to send initial log.
82 // SENDING_INITIAL_STABILITY_LOG, // Initial stability log being sent.
83 // SENDING_INITIAL_METRICS_LOG, // Initial metrics log being sent.
84 // SENDING_OLD_LOGS, // Sending unsent logs from previous session.
85 // SENDING_CURRENT_LOGS, // Sending ongoing logs as they acrue.
87 // In more detail, we have:
89 // INITIALIZED, // Constructor was called.
90 // The MS has been constructed, but has taken no actions to compose the
93 // INIT_TASK_SCHEDULED, // Waiting for deferred init tasks to finish.
94 // Typically about 30 seconds after startup, a task is sent to a second thread
95 // (the file thread) to perform deferred (lower priority and slower)
96 // initialization steps such as getting the list of plugins. That task will
97 // (when complete) make an async callback (via a Task) to indicate the
100 // INIT_TASK_DONE, // Waiting for timer to send initial log.
101 // The callback has arrived, and it is now possible for an initial log to be
102 // created. This callback typically arrives back less than one second after
103 // the deferred init task is dispatched.
105 // SENDING_INITIAL_STABILITY_LOG, // Initial stability log being sent.
106 // During initialization, if a crash occurred during the previous session, an
107 // initial stability log will be generated and registered with the log manager.
108 // This state will be entered if a stability log was prepared during metrics
109 // service initialization (in InitializeMetricsRecordingState()) and is waiting
110 // to be transmitted when it's time to send up the first log (per the reporting
111 // scheduler). If there is no initial stability log (e.g. there was no previous
112 // crash), then this state will be skipped and the state will advance to
113 // SENDING_INITIAL_METRICS_LOG.
115 // SENDING_INITIAL_METRICS_LOG, // Initial metrics log being sent.
116 // This state is entered after the initial metrics log has been composed, and
117 // prepared for transmission. This happens after SENDING_INITIAL_STABILITY_LOG
118 // if there was an initial stability log (see above). It is also the case that
119 // any previously unsent logs have been loaded into instance variables for
120 // possible transmission.
122 // SENDING_OLD_LOGS, // Sending unsent logs from previous session.
123 // This state indicates that the initial log for this session has been
124 // successfully sent and it is now time to send any logs that were
125 // saved from previous sessions. All such logs will be transmitted before
126 // exiting this state, and proceeding with ongoing logs from the current session
129 // SENDING_CURRENT_LOGS, // Sending standard current logs as they accrue.
130 // Current logs are being accumulated. Typically every 20 minutes a log is
131 // closed and finalized for transmission, at the same time as a new log is
134 // The progression through the above states is simple, and sequential, in the
135 // most common use cases. States proceed from INITIAL to SENDING_CURRENT_LOGS,
136 // and remain in the latter until shutdown.
138 // The one unusual case is when the user asks that we stop logging. When that
139 // happens, any staged (transmission in progress) log is persisted, and any log
140 // that is currently accumulating is also finalized and persisted. We then
141 // regress back to the SEND_OLD_LOGS state in case the user enables log
142 // recording again during this session. This way anything we have persisted
143 // will be sent automatically if/when we progress back to SENDING_CURRENT_LOG
146 // Another similar case is on mobile, when the application is backgrounded and
147 // then foregrounded again. Backgrounding created new "old" stored logs, so the
148 // state drops back from SENDING_CURRENT_LOGS to SENDING_OLD_LOGS so those logs
151 // Also note that whenever we successfully send an old log, we mirror the list
152 // of logs into the PrefService. This ensures that IF we crash, we won't start
153 // up and retransmit our old logs again.
155 // Due to race conditions, it is always possible that a log file could be sent
156 // twice. For example, if a log file is sent, but not yet acknowledged by
157 // the external server, and the user shuts down, then a copy of the log may be
158 // saved for re-transmission. These duplicates could be filtered out server
159 // side, but are not expected to be a significant problem.
162 //------------------------------------------------------------------------------
164 #include "components/metrics/metrics_service.h"
168 #include "base/bind.h"
169 #include "base/callback.h"
170 #include "base/metrics/histogram.h"
171 #include "base/metrics/histogram_base.h"
172 #include "base/metrics/histogram_samples.h"
173 #include "base/metrics/sparse_histogram.h"
174 #include "base/metrics/statistics_recorder.h"
175 #include "base/prefs/pref_registry_simple.h"
176 #include "base/prefs/pref_service.h"
177 #include "base/strings/string_number_conversions.h"
178 #include "base/strings/utf_string_conversions.h"
179 #include "base/threading/platform_thread.h"
180 #include "base/threading/thread.h"
181 #include "base/threading/thread_restrictions.h"
182 #include "base/time/time.h"
183 #include "base/tracked_objects.h"
184 #include "base/values.h"
185 #include "components/metrics/metrics_log.h"
186 #include "components/metrics/metrics_log_manager.h"
187 #include "components/metrics/metrics_log_uploader.h"
188 #include "components/metrics/metrics_pref_names.h"
189 #include "components/metrics/metrics_reporting_scheduler.h"
190 #include "components/metrics/metrics_service_client.h"
191 #include "components/metrics/metrics_state_manager.h"
192 #include "components/variations/entropy_provider.h"
198 // Check to see that we're being called on only one thread.
199 bool IsSingleThreaded() {
200 static base::PlatformThreadId thread_id
= 0;
202 thread_id
= base::PlatformThread::CurrentId();
203 return base::PlatformThread::CurrentId() == thread_id
;
206 // The delay, in seconds, after starting recording before doing expensive
207 // initialization work.
208 #if defined(OS_ANDROID) || defined(OS_IOS)
209 // On mobile devices, a significant portion of sessions last less than a minute.
210 // Use a shorter timer on these platforms to avoid losing data.
211 // TODO(dfalcantara): To avoid delaying startup, tighten up initialization so
212 // that it occurs after the user gets their initial page.
213 const int kInitializationDelaySeconds
= 5;
215 const int kInitializationDelaySeconds
= 30;
218 // The maximum number of events in a log uploaded to the UMA server.
219 const int kEventLimit
= 2400;
221 // If an upload fails, and the transmission was over this byte count, then we
222 // will discard the log, and not try to retransmit it. We also don't persist
223 // the log to the prefs for transmission during the next chrome session if this
224 // limit is exceeded.
225 const size_t kUploadLogAvoidRetransmitSize
= 100 * 1024;
227 // Interval, in minutes, between state saves.
228 const int kSaveStateIntervalMinutes
= 5;
230 // The metrics server's URL.
231 const char kServerUrl
[] = "https://clients4.google.com/uma/v2";
233 // The MIME type for the uploaded metrics data.
234 const char kMimeType
[] = "application/vnd.chrome.uma";
236 enum ResponseStatus
{
239 BAD_REQUEST
, // Invalid syntax or log too large.
241 NUM_RESPONSE_STATUSES
244 ResponseStatus
ResponseCodeToStatus(int response_code
) {
245 switch (response_code
) {
253 return UNKNOWN_FAILURE
;
257 bool NewInitialMetricsTimingEnabled() {
258 return base::FieldTrialList::FindFullName("UMAInitialMetricsTiming") ==
262 void MarkAppCleanShutdownAndCommit(CleanExitBeacon
* clean_exit_beacon
,
263 PrefService
* local_state
) {
264 clean_exit_beacon
->WriteBeaconValue(true);
265 local_state
->SetInteger(prefs::kStabilityExecutionPhase
,
266 MetricsService::SHUTDOWN_COMPLETE
);
267 // Start writing right away (write happens on a different thread).
268 local_state
->CommitPendingWrite();
274 SyntheticTrialGroup::SyntheticTrialGroup(uint32 trial
, uint32 group
) {
279 SyntheticTrialGroup::~SyntheticTrialGroup() {
283 MetricsService::ShutdownCleanliness
MetricsService::clean_shutdown_status_
=
284 MetricsService::CLEANLY_SHUTDOWN
;
286 MetricsService::ExecutionPhase
MetricsService::execution_phase_
=
287 MetricsService::UNINITIALIZED_PHASE
;
290 void MetricsService::RegisterPrefs(PrefRegistrySimple
* registry
) {
291 DCHECK(IsSingleThreaded());
292 MetricsStateManager::RegisterPrefs(registry
);
293 MetricsLog::RegisterPrefs(registry
);
295 registry
->RegisterInt64Pref(prefs::kInstallDate
, 0);
297 registry
->RegisterInt64Pref(prefs::kStabilityLaunchTimeSec
, 0);
298 registry
->RegisterInt64Pref(prefs::kStabilityLastTimestampSec
, 0);
299 registry
->RegisterStringPref(prefs::kStabilityStatsVersion
, std::string());
300 registry
->RegisterInt64Pref(prefs::kStabilityStatsBuildTime
, 0);
301 registry
->RegisterBooleanPref(prefs::kStabilityExitedCleanly
, true);
302 registry
->RegisterIntegerPref(prefs::kStabilityExecutionPhase
,
303 UNINITIALIZED_PHASE
);
304 registry
->RegisterBooleanPref(prefs::kStabilitySessionEndCompleted
, true);
305 registry
->RegisterIntegerPref(prefs::kMetricsSessionID
, -1);
307 registry
->RegisterListPref(prefs::kMetricsInitialLogs
);
308 registry
->RegisterListPref(prefs::kMetricsOngoingLogs
);
310 registry
->RegisterInt64Pref(prefs::kUninstallLaunchCount
, 0);
311 registry
->RegisterInt64Pref(prefs::kUninstallMetricsUptimeSec
, 0);
314 MetricsService::MetricsService(MetricsStateManager
* state_manager
,
315 MetricsServiceClient
* client
,
316 PrefService
* local_state
)
317 : log_manager_(local_state
, kUploadLogAvoidRetransmitSize
),
318 histogram_snapshot_manager_(this),
319 state_manager_(state_manager
),
321 local_state_(local_state
),
322 clean_exit_beacon_(client
->GetRegistryBackupKey(), local_state
),
323 recording_active_(false),
324 reporting_active_(false),
325 test_mode_active_(false),
327 has_initial_stability_log_(false),
328 log_upload_in_progress_(false),
329 idle_since_last_transmission_(false),
331 self_ptr_factory_(this),
332 state_saver_factory_(this) {
333 DCHECK(IsSingleThreaded());
334 DCHECK(state_manager_
);
336 DCHECK(local_state_
);
338 // Set the install date if this is our first run.
339 int64 install_date
= local_state_
->GetInt64(prefs::kInstallDate
);
340 if (install_date
== 0)
341 local_state_
->SetInt64(prefs::kInstallDate
, base::Time::Now().ToTimeT());
344 MetricsService::~MetricsService() {
348 void MetricsService::InitializeMetricsRecordingState() {
349 InitializeMetricsState();
351 base::Closure callback
= base::Bind(&MetricsService::StartScheduledUpload
,
352 self_ptr_factory_
.GetWeakPtr());
353 scheduler_
.reset(new MetricsReportingScheduler(callback
));
356 void MetricsService::Start() {
357 HandleIdleSinceLastTransmission(false);
362 bool MetricsService::StartIfMetricsReportingEnabled() {
363 const bool enabled
= state_manager_
->IsMetricsReportingEnabled();
369 void MetricsService::StartRecordingForTests() {
370 test_mode_active_
= true;
375 void MetricsService::Stop() {
376 HandleIdleSinceLastTransmission(false);
381 void MetricsService::EnableReporting() {
382 if (reporting_active_
)
384 reporting_active_
= true;
385 StartSchedulerIfNecessary();
388 void MetricsService::DisableReporting() {
389 reporting_active_
= false;
392 std::string
MetricsService::GetClientId() {
393 return state_manager_
->client_id();
396 int64
MetricsService::GetInstallDate() {
397 return local_state_
->GetInt64(prefs::kInstallDate
);
400 scoped_ptr
<const base::FieldTrial::EntropyProvider
>
401 MetricsService::CreateEntropyProvider() {
402 // TODO(asvitkine): Refactor the code so that MetricsService does not expose
404 return state_manager_
->CreateEntropyProvider();
407 void MetricsService::EnableRecording() {
408 DCHECK(IsSingleThreaded());
410 if (recording_active_
)
412 recording_active_
= true;
414 state_manager_
->ForceClientIdCreation();
415 client_
->SetMetricsClientId(state_manager_
->client_id());
416 if (!log_manager_
.current_log())
419 for (size_t i
= 0; i
< metrics_providers_
.size(); ++i
)
420 metrics_providers_
[i
]->OnRecordingEnabled();
422 base::RemoveActionCallback(action_callback_
);
423 action_callback_
= base::Bind(&MetricsService::OnUserAction
,
424 base::Unretained(this));
425 base::AddActionCallback(action_callback_
);
428 void MetricsService::DisableRecording() {
429 DCHECK(IsSingleThreaded());
431 if (!recording_active_
)
433 recording_active_
= false;
435 base::RemoveActionCallback(action_callback_
);
437 for (size_t i
= 0; i
< metrics_providers_
.size(); ++i
)
438 metrics_providers_
[i
]->OnRecordingDisabled();
440 PushPendingLogsToPersistentStorage();
443 bool MetricsService::recording_active() const {
444 DCHECK(IsSingleThreaded());
445 return recording_active_
;
448 bool MetricsService::reporting_active() const {
449 DCHECK(IsSingleThreaded());
450 return reporting_active_
;
453 void MetricsService::RecordDelta(const base::HistogramBase
& histogram
,
454 const base::HistogramSamples
& snapshot
) {
455 log_manager_
.current_log()->RecordHistogramDelta(histogram
.histogram_name(),
459 void MetricsService::InconsistencyDetected(
460 base::HistogramBase::Inconsistency problem
) {
461 UMA_HISTOGRAM_ENUMERATION("Histogram.InconsistenciesBrowser",
462 problem
, base::HistogramBase::NEVER_EXCEEDED_VALUE
);
465 void MetricsService::UniqueInconsistencyDetected(
466 base::HistogramBase::Inconsistency problem
) {
467 UMA_HISTOGRAM_ENUMERATION("Histogram.InconsistenciesBrowserUnique",
468 problem
, base::HistogramBase::NEVER_EXCEEDED_VALUE
);
471 void MetricsService::InconsistencyDetectedInLoggedCount(int amount
) {
472 UMA_HISTOGRAM_COUNTS("Histogram.InconsistentSnapshotBrowser",
476 void MetricsService::HandleIdleSinceLastTransmission(bool in_idle
) {
477 // If there wasn't a lot of action, maybe the computer was asleep, in which
478 // case, the log transmissions should have stopped. Here we start them up
480 if (!in_idle
&& idle_since_last_transmission_
)
481 StartSchedulerIfNecessary();
482 idle_since_last_transmission_
= in_idle
;
485 void MetricsService::OnApplicationNotIdle() {
486 if (recording_active_
)
487 HandleIdleSinceLastTransmission(false);
490 void MetricsService::RecordStartOfSessionEnd() {
492 RecordBooleanPrefValue(prefs::kStabilitySessionEndCompleted
, false);
495 void MetricsService::RecordCompletedSessionEnd() {
497 RecordBooleanPrefValue(prefs::kStabilitySessionEndCompleted
, true);
500 #if defined(OS_ANDROID) || defined(OS_IOS)
501 void MetricsService::OnAppEnterBackground() {
504 MarkAppCleanShutdownAndCommit(&clean_exit_beacon_
, local_state_
);
506 // At this point, there's no way of knowing when the process will be
507 // killed, so this has to be treated similar to a shutdown, closing and
508 // persisting all logs. Unlinke a shutdown, the state is primed to be ready
509 // to continue logging and uploading if the process does return.
510 if (recording_active() && state_
>= SENDING_INITIAL_STABILITY_LOG
) {
511 PushPendingLogsToPersistentStorage();
512 // Persisting logs closes the current log, so start recording a new log
513 // immediately to capture any background work that might be done before the
514 // process is killed.
519 void MetricsService::OnAppEnterForeground() {
520 clean_exit_beacon_
.WriteBeaconValue(false);
521 StartSchedulerIfNecessary();
524 void MetricsService::LogNeedForCleanShutdown() {
525 clean_exit_beacon_
.WriteBeaconValue(false);
526 // Redundant setting to be sure we call for a clean shutdown.
527 clean_shutdown_status_
= NEED_TO_SHUTDOWN
;
529 #endif // defined(OS_ANDROID) || defined(OS_IOS)
532 void MetricsService::SetExecutionPhase(ExecutionPhase execution_phase
,
533 PrefService
* local_state
) {
534 execution_phase_
= execution_phase
;
535 local_state
->SetInteger(prefs::kStabilityExecutionPhase
, execution_phase_
);
538 void MetricsService::RecordBreakpadRegistration(bool success
) {
540 IncrementPrefValue(prefs::kStabilityBreakpadRegistrationFail
);
542 IncrementPrefValue(prefs::kStabilityBreakpadRegistrationSuccess
);
545 void MetricsService::RecordBreakpadHasDebugger(bool has_debugger
) {
547 IncrementPrefValue(prefs::kStabilityDebuggerNotPresent
);
549 IncrementPrefValue(prefs::kStabilityDebuggerPresent
);
552 void MetricsService::ClearSavedStabilityMetrics() {
553 for (size_t i
= 0; i
< metrics_providers_
.size(); ++i
)
554 metrics_providers_
[i
]->ClearSavedStabilityMetrics();
556 // Reset the prefs that are managed by MetricsService/MetricsLog directly.
557 local_state_
->SetInteger(prefs::kStabilityCrashCount
, 0);
558 local_state_
->SetInteger(prefs::kStabilityExecutionPhase
,
559 UNINITIALIZED_PHASE
);
560 local_state_
->SetInteger(prefs::kStabilityIncompleteSessionEndCount
, 0);
561 local_state_
->SetInteger(prefs::kStabilityLaunchCount
, 0);
562 local_state_
->SetBoolean(prefs::kStabilitySessionEndCompleted
, true);
565 //------------------------------------------------------------------------------
567 //------------------------------------------------------------------------------
570 //------------------------------------------------------------------------------
571 // Initialization methods
573 void MetricsService::InitializeMetricsState() {
574 const int64 buildtime
= MetricsLog::GetBuildTime();
575 const std::string version
= client_
->GetVersionString();
576 bool version_changed
= false;
577 if (local_state_
->GetInt64(prefs::kStabilityStatsBuildTime
) != buildtime
||
578 local_state_
->GetString(prefs::kStabilityStatsVersion
) != version
) {
579 local_state_
->SetString(prefs::kStabilityStatsVersion
, version
);
580 local_state_
->SetInt64(prefs::kStabilityStatsBuildTime
, buildtime
);
581 version_changed
= true;
584 log_manager_
.LoadPersistedUnsentLogs();
586 session_id_
= local_state_
->GetInteger(prefs::kMetricsSessionID
);
588 if (!clean_exit_beacon_
.exited_cleanly()) {
589 IncrementPrefValue(prefs::kStabilityCrashCount
);
590 // Reset flag, and wait until we call LogNeedForCleanShutdown() before
592 clean_exit_beacon_
.WriteBeaconValue(true);
595 if (!clean_exit_beacon_
.exited_cleanly() || ProvidersHaveStabilityMetrics()) {
596 // TODO(rtenneti): On windows, consider saving/getting execution_phase from
598 int execution_phase
=
599 local_state_
->GetInteger(prefs::kStabilityExecutionPhase
);
600 UMA_HISTOGRAM_SPARSE_SLOWLY("Chrome.Browser.CrashedExecutionPhase",
603 // If the previous session didn't exit cleanly, or if any provider
604 // explicitly requests it, prepare an initial stability log -
605 // provided UMA is enabled.
606 if (state_manager_
->IsMetricsReportingEnabled())
607 PrepareInitialStabilityLog();
610 // If no initial stability log was generated and there was a version upgrade,
611 // clear the stability stats from the previous version (so that they don't get
612 // attributed to the current version). This could otherwise happen due to a
613 // number of different edge cases, such as if the last version crashed before
614 // it could save off a system profile or if UMA reporting is disabled (which
615 // normally results in stats being accumulated).
616 if (!has_initial_stability_log_
&& version_changed
)
617 ClearSavedStabilityMetrics();
619 // Update session ID.
621 local_state_
->SetInteger(prefs::kMetricsSessionID
, session_id_
);
623 // Stability bookkeeping
624 IncrementPrefValue(prefs::kStabilityLaunchCount
);
626 DCHECK_EQ(UNINITIALIZED_PHASE
, execution_phase_
);
627 SetExecutionPhase(START_METRICS_RECORDING
, local_state_
);
629 if (!local_state_
->GetBoolean(prefs::kStabilitySessionEndCompleted
)) {
630 IncrementPrefValue(prefs::kStabilityIncompleteSessionEndCount
);
631 // This is marked false when we get a WM_ENDSESSION.
632 local_state_
->SetBoolean(prefs::kStabilitySessionEndCompleted
, true);
635 // Call GetUptimes() for the first time, thus allowing all later calls
636 // to record incremental uptimes accurately.
637 base::TimeDelta ignored_uptime_parameter
;
638 base::TimeDelta startup_uptime
;
639 GetUptimes(local_state_
, &startup_uptime
, &ignored_uptime_parameter
);
640 DCHECK_EQ(0, startup_uptime
.InMicroseconds());
641 // For backwards compatibility, leave this intact in case Omaha is checking
642 // them. prefs::kStabilityLastTimestampSec may also be useless now.
643 // TODO(jar): Delete these if they have no uses.
644 local_state_
->SetInt64(prefs::kStabilityLaunchTimeSec
,
645 base::Time::Now().ToTimeT());
647 // Bookkeeping for the uninstall metrics.
648 IncrementLongPrefsValue(prefs::kUninstallLaunchCount
);
650 // Kick off the process of saving the state (so the uptime numbers keep
651 // getting updated) every n minutes.
652 ScheduleNextStateSave();
655 void MetricsService::OnUserAction(const std::string
& action
) {
656 if (!ShouldLogEvents())
659 log_manager_
.current_log()->RecordUserAction(action
);
660 HandleIdleSinceLastTransmission(false);
663 void MetricsService::FinishedGatheringInitialMetrics() {
664 DCHECK_EQ(INIT_TASK_SCHEDULED
, state_
);
665 state_
= INIT_TASK_DONE
;
667 // Create the initial log.
668 if (!initial_metrics_log_
.get()) {
669 initial_metrics_log_
= CreateLog(MetricsLog::ONGOING_LOG
);
670 NotifyOnDidCreateMetricsLog();
673 scheduler_
->InitTaskComplete();
676 void MetricsService::GetUptimes(PrefService
* pref
,
677 base::TimeDelta
* incremental_uptime
,
678 base::TimeDelta
* uptime
) {
679 base::TimeTicks now
= base::TimeTicks::Now();
680 // If this is the first call, init |first_updated_time_| and
681 // |last_updated_time_|.
682 if (last_updated_time_
.is_null()) {
683 first_updated_time_
= now
;
684 last_updated_time_
= now
;
686 *incremental_uptime
= now
- last_updated_time_
;
687 *uptime
= now
- first_updated_time_
;
688 last_updated_time_
= now
;
690 const int64 incremental_time_secs
= incremental_uptime
->InSeconds();
691 if (incremental_time_secs
> 0) {
692 int64 metrics_uptime
= pref
->GetInt64(prefs::kUninstallMetricsUptimeSec
);
693 metrics_uptime
+= incremental_time_secs
;
694 pref
->SetInt64(prefs::kUninstallMetricsUptimeSec
, metrics_uptime
);
698 void MetricsService::NotifyOnDidCreateMetricsLog() {
699 DCHECK(IsSingleThreaded());
700 for (size_t i
= 0; i
< metrics_providers_
.size(); ++i
)
701 metrics_providers_
[i
]->OnDidCreateMetricsLog();
704 //------------------------------------------------------------------------------
705 // State save methods
707 void MetricsService::ScheduleNextStateSave() {
708 state_saver_factory_
.InvalidateWeakPtrs();
710 base::MessageLoop::current()->PostDelayedTask(FROM_HERE
,
711 base::Bind(&MetricsService::SaveLocalState
,
712 state_saver_factory_
.GetWeakPtr()),
713 base::TimeDelta::FromMinutes(kSaveStateIntervalMinutes
));
716 void MetricsService::SaveLocalState() {
717 RecordCurrentState(local_state_
);
719 // TODO(jar):110021 Does this run down the batteries????
720 ScheduleNextStateSave();
724 //------------------------------------------------------------------------------
725 // Recording control methods
727 void MetricsService::OpenNewLog() {
728 DCHECK(!log_manager_
.current_log());
730 log_manager_
.BeginLoggingWithLog(CreateLog(MetricsLog::ONGOING_LOG
));
731 NotifyOnDidCreateMetricsLog();
732 if (state_
== INITIALIZED
) {
733 // We only need to schedule that run once.
734 state_
= INIT_TASK_SCHEDULED
;
736 base::MessageLoop::current()->PostDelayedTask(
738 base::Bind(&MetricsService::StartGatheringMetrics
,
739 self_ptr_factory_
.GetWeakPtr()),
740 base::TimeDelta::FromSeconds(kInitializationDelaySeconds
));
744 void MetricsService::StartGatheringMetrics() {
745 client_
->StartGatheringMetrics(
746 base::Bind(&MetricsService::FinishedGatheringInitialMetrics
,
747 self_ptr_factory_
.GetWeakPtr()));
750 void MetricsService::CloseCurrentLog() {
751 if (!log_manager_
.current_log())
754 // TODO(jar): Integrate bounds on log recording more consistently, so that we
755 // can stop recording logs that are too big much sooner.
756 if (log_manager_
.current_log()->num_events() > kEventLimit
) {
757 UMA_HISTOGRAM_COUNTS("UMA.Discarded Log Events",
758 log_manager_
.current_log()->num_events());
759 log_manager_
.DiscardCurrentLog();
760 OpenNewLog(); // Start trivial log to hold our histograms.
763 // Put incremental data (histogram deltas, and realtime stats deltas) at the
764 // end of all log transmissions (initial log handles this separately).
765 // RecordIncrementalStabilityElements only exists on the derived
767 MetricsLog
* current_log
= log_manager_
.current_log();
769 RecordCurrentEnvironment(current_log
);
770 base::TimeDelta incremental_uptime
;
771 base::TimeDelta uptime
;
772 GetUptimes(local_state_
, &incremental_uptime
, &uptime
);
773 current_log
->RecordStabilityMetrics(metrics_providers_
.get(),
774 incremental_uptime
, uptime
);
776 current_log
->RecordGeneralMetrics(metrics_providers_
.get());
777 RecordCurrentHistograms();
779 log_manager_
.FinishCurrentLog();
782 void MetricsService::PushPendingLogsToPersistentStorage() {
783 if (state_
< SENDING_INITIAL_STABILITY_LOG
)
784 return; // We didn't and still don't have time to get plugin list etc.
787 log_manager_
.PersistUnsentLogs();
789 // If there was a staged and/or current log, then there is now at least one
790 // log waiting to be uploaded.
791 if (log_manager_
.has_unsent_logs())
792 state_
= SENDING_OLD_LOGS
;
795 //------------------------------------------------------------------------------
796 // Transmission of logs methods
798 void MetricsService::StartSchedulerIfNecessary() {
799 // Never schedule cutting or uploading of logs in test mode.
800 if (test_mode_active_
)
803 // Even if reporting is disabled, the scheduler is needed to trigger the
804 // creation of the initial log, which must be done in order for any logs to be
805 // persisted on shutdown or backgrounding.
806 if (recording_active() &&
807 (reporting_active() || state_
< SENDING_INITIAL_STABILITY_LOG
)) {
812 void MetricsService::StartScheduledUpload() {
813 // If we're getting no notifications, then the log won't have much in it, and
814 // it's possible the computer is about to go to sleep, so don't upload and
815 // stop the scheduler.
816 // If recording has been turned off, the scheduler doesn't need to run.
817 // If reporting is off, proceed if the initial log hasn't been created, since
818 // that has to happen in order for logs to be cut and stored when persisting.
819 // TODO(stuartmorgan): Call Stop() on the scheduler when reporting and/or
820 // recording are turned off instead of letting it fire and then aborting.
821 if (idle_since_last_transmission_
||
822 !recording_active() ||
823 (!reporting_active() && state_
>= SENDING_INITIAL_STABILITY_LOG
)) {
825 scheduler_
->UploadCancelled();
829 // If the callback was to upload an old log, but there no longer is one,
830 // just report success back to the scheduler to begin the ongoing log
832 // TODO(stuartmorgan): Consider removing the distinction between
833 // SENDING_OLD_LOGS and SENDING_CURRENT_LOGS to simplify the state machine
834 // now that the log upload flow is the same for both modes.
835 if (state_
== SENDING_OLD_LOGS
&& !log_manager_
.has_unsent_logs()) {
836 state_
= SENDING_CURRENT_LOGS
;
837 scheduler_
->UploadFinished(true /* healthy */, false /* no unsent logs */);
840 // If there are unsent logs, send the next one. If not, start the asynchronous
841 // process of finalizing the current log for upload.
842 if (state_
== SENDING_OLD_LOGS
) {
843 DCHECK(log_manager_
.has_unsent_logs());
844 log_manager_
.StageNextLogForUpload();
847 client_
->CollectFinalMetrics(
848 base::Bind(&MetricsService::OnFinalLogInfoCollectionDone
,
849 self_ptr_factory_
.GetWeakPtr()));
853 void MetricsService::OnFinalLogInfoCollectionDone() {
854 // If somehow there is a log upload in progress, we return and hope things
855 // work out. The scheduler isn't informed since if this happens, the scheduler
856 // will get a response from the upload.
857 DCHECK(!log_upload_in_progress_
);
858 if (log_upload_in_progress_
)
861 // Abort if metrics were turned off during the final info gathering.
862 if (!recording_active()) {
864 scheduler_
->UploadCancelled();
870 // If logs shouldn't be uploaded, stop here. It's important that this check
871 // be after StageNewLog(), otherwise the previous logs will never be loaded,
872 // and thus the open log won't be persisted.
873 // TODO(stuartmorgan): This is unnecessarily complicated; restructure loading
874 // of previous logs to not require running part of the upload logic.
875 // http://crbug.com/157337
876 if (!reporting_active()) {
878 scheduler_
->UploadCancelled();
885 void MetricsService::StageNewLog() {
886 if (log_manager_
.has_staged_log())
891 case INIT_TASK_SCHEDULED
: // We should be further along by now.
896 if (NewInitialMetricsTimingEnabled()) {
897 PrepareInitialMetricsLog();
898 // Stage the first log, which could be a stability log (either one
899 // for created in this session or from a previous session) or the
900 // initial metrics log that was just created.
901 log_manager_
.StageNextLogForUpload();
902 if (has_initial_stability_log_
) {
903 // The initial stability log was just staged.
904 has_initial_stability_log_
= false;
905 state_
= SENDING_INITIAL_STABILITY_LOG
;
907 state_
= SENDING_INITIAL_METRICS_LOG
;
910 if (has_initial_stability_log_
) {
911 // There's an initial stability log, ready to send.
912 log_manager_
.StageNextLogForUpload();
913 has_initial_stability_log_
= false;
914 state_
= SENDING_INITIAL_STABILITY_LOG
;
916 PrepareInitialMetricsLog();
917 log_manager_
.StageNextLogForUpload();
918 state_
= SENDING_INITIAL_METRICS_LOG
;
923 case SENDING_OLD_LOGS
:
924 NOTREACHED(); // Shouldn't be staging a new log during old log sending.
927 case SENDING_CURRENT_LOGS
:
930 log_manager_
.StageNextLogForUpload();
938 DCHECK(log_manager_
.has_staged_log());
941 bool MetricsService::ProvidersHaveStabilityMetrics() {
942 // Check whether any metrics provider has stability metrics.
943 for (size_t i
= 0; i
< metrics_providers_
.size(); ++i
) {
944 if (metrics_providers_
[i
]->HasStabilityMetrics())
951 void MetricsService::PrepareInitialStabilityLog() {
952 DCHECK_EQ(INITIALIZED
, state_
);
954 scoped_ptr
<MetricsLog
> initial_stability_log(
955 CreateLog(MetricsLog::INITIAL_STABILITY_LOG
));
957 // Do not call NotifyOnDidCreateMetricsLog here because the stability
958 // log describes stats from the _previous_ session.
960 if (!initial_stability_log
->LoadSavedEnvironmentFromPrefs())
963 log_manager_
.PauseCurrentLog();
964 log_manager_
.BeginLoggingWithLog(initial_stability_log
.Pass());
966 // Note: Some stability providers may record stability stats via histograms,
967 // so this call has to be after BeginLoggingWithLog().
968 log_manager_
.current_log()->RecordStabilityMetrics(
969 metrics_providers_
.get(), base::TimeDelta(), base::TimeDelta());
970 RecordCurrentStabilityHistograms();
972 // Note: RecordGeneralMetrics() intentionally not called since this log is for
973 // stability stats from a previous session only.
975 log_manager_
.FinishCurrentLog();
976 log_manager_
.ResumePausedLog();
978 // Store unsent logs, including the stability log that was just saved, so
979 // that they're not lost in case of a crash before upload time.
980 log_manager_
.PersistUnsentLogs();
982 has_initial_stability_log_
= true;
985 void MetricsService::PrepareInitialMetricsLog() {
986 DCHECK(state_
== INIT_TASK_DONE
|| state_
== SENDING_INITIAL_STABILITY_LOG
);
988 RecordCurrentEnvironment(initial_metrics_log_
.get());
989 base::TimeDelta incremental_uptime
;
990 base::TimeDelta uptime
;
991 GetUptimes(local_state_
, &incremental_uptime
, &uptime
);
993 // Histograms only get written to the current log, so make the new log current
994 // before writing them.
995 log_manager_
.PauseCurrentLog();
996 log_manager_
.BeginLoggingWithLog(initial_metrics_log_
.Pass());
998 // Note: Some stability providers may record stability stats via histograms,
999 // so this call has to be after BeginLoggingWithLog().
1000 MetricsLog
* current_log
= log_manager_
.current_log();
1001 current_log
->RecordStabilityMetrics(metrics_providers_
.get(),
1002 base::TimeDelta(), base::TimeDelta());
1003 current_log
->RecordGeneralMetrics(metrics_providers_
.get());
1004 RecordCurrentHistograms();
1006 log_manager_
.FinishCurrentLog();
1007 log_manager_
.ResumePausedLog();
1009 // Store unsent logs, including the initial log that was just saved, so
1010 // that they're not lost in case of a crash before upload time.
1011 log_manager_
.PersistUnsentLogs();
1014 void MetricsService::SendStagedLog() {
1015 DCHECK(log_manager_
.has_staged_log());
1016 if (!log_manager_
.has_staged_log())
1019 DCHECK(!log_upload_in_progress_
);
1020 log_upload_in_progress_
= true;
1022 if (!log_uploader_
) {
1023 log_uploader_
= client_
->CreateUploader(
1024 kServerUrl
, kMimeType
,
1025 base::Bind(&MetricsService::OnLogUploadComplete
,
1026 self_ptr_factory_
.GetWeakPtr()));
1029 const std::string hash
=
1030 base::HexEncode(log_manager_
.staged_log_hash().data(),
1031 log_manager_
.staged_log_hash().size());
1032 bool success
= log_uploader_
->UploadLog(log_manager_
.staged_log(), hash
);
1033 UMA_HISTOGRAM_BOOLEAN("UMA.UploadCreation", success
);
1035 // Skip this upload and hope things work out next time.
1036 log_manager_
.DiscardStagedLog();
1037 scheduler_
->UploadCancelled();
1038 log_upload_in_progress_
= false;
1042 HandleIdleSinceLastTransmission(true);
1046 void MetricsService::OnLogUploadComplete(int response_code
) {
1047 DCHECK(log_upload_in_progress_
);
1048 log_upload_in_progress_
= false;
1050 // Log a histogram to track response success vs. failure rates.
1051 UMA_HISTOGRAM_ENUMERATION("UMA.UploadResponseStatus.Protobuf",
1052 ResponseCodeToStatus(response_code
),
1053 NUM_RESPONSE_STATUSES
);
1055 bool upload_succeeded
= response_code
== 200;
1057 // Provide boolean for error recovery (allow us to ignore response_code).
1058 bool discard_log
= false;
1059 const size_t log_size
= log_manager_
.staged_log().length();
1060 if (upload_succeeded
) {
1061 UMA_HISTOGRAM_COUNTS_10000("UMA.LogSize.OnSuccess", log_size
/ 1024);
1062 } else if (log_size
> kUploadLogAvoidRetransmitSize
) {
1063 UMA_HISTOGRAM_COUNTS("UMA.Large Rejected Log was Discarded",
1064 static_cast<int>(log_size
));
1066 } else if (response_code
== 400) {
1067 // Bad syntax. Retransmission won't work.
1071 if (upload_succeeded
|| discard_log
) {
1072 log_manager_
.DiscardStagedLog();
1073 // Store the updated list to disk now that the removed log is uploaded.
1074 log_manager_
.PersistUnsentLogs();
1077 if (!log_manager_
.has_staged_log()) {
1079 case SENDING_INITIAL_STABILITY_LOG
:
1080 if (NewInitialMetricsTimingEnabled()) {
1081 // The initial metrics log is already in the queue of unsent logs.
1082 state_
= SENDING_OLD_LOGS
;
1084 PrepareInitialMetricsLog();
1085 log_manager_
.StageNextLogForUpload();
1087 state_
= SENDING_INITIAL_METRICS_LOG
;
1091 case SENDING_INITIAL_METRICS_LOG
:
1092 state_
= log_manager_
.has_unsent_logs() ? SENDING_OLD_LOGS
1093 : SENDING_CURRENT_LOGS
;
1096 case SENDING_OLD_LOGS
:
1097 if (!log_manager_
.has_unsent_logs())
1098 state_
= SENDING_CURRENT_LOGS
;
1101 case SENDING_CURRENT_LOGS
:
1109 if (log_manager_
.has_unsent_logs())
1110 DCHECK_LT(state_
, SENDING_CURRENT_LOGS
);
1113 // Error 400 indicates a problem with the log, not with the server, so
1114 // don't consider that a sign that the server is in trouble.
1115 bool server_is_healthy
= upload_succeeded
|| response_code
== 400;
1116 // Don't notify the scheduler that the upload is finished if we've only sent
1117 // the initial stability log, but not yet the initial metrics log (treat the
1118 // two as a single unit of work as far as the scheduler is concerned).
1119 if (state_
!= SENDING_INITIAL_METRICS_LOG
) {
1120 scheduler_
->UploadFinished(server_is_healthy
,
1121 log_manager_
.has_unsent_logs());
1124 if (server_is_healthy
)
1125 client_
->OnLogUploadComplete();
1128 void MetricsService::IncrementPrefValue(const char* path
) {
1129 int value
= local_state_
->GetInteger(path
);
1130 local_state_
->SetInteger(path
, value
+ 1);
1133 void MetricsService::IncrementLongPrefsValue(const char* path
) {
1134 int64 value
= local_state_
->GetInt64(path
);
1135 local_state_
->SetInt64(path
, value
+ 1);
1138 bool MetricsService::UmaMetricsProperlyShutdown() {
1139 CHECK(clean_shutdown_status_
== CLEANLY_SHUTDOWN
||
1140 clean_shutdown_status_
== NEED_TO_SHUTDOWN
);
1141 return clean_shutdown_status_
== CLEANLY_SHUTDOWN
;
1144 void MetricsService::AddSyntheticTrialObserver(
1145 SyntheticTrialObserver
* observer
) {
1146 synthetic_trial_observer_list_
.AddObserver(observer
);
1147 if (!synthetic_trial_groups_
.empty())
1148 observer
->OnSyntheticTrialsChanged(synthetic_trial_groups_
);
1151 void MetricsService::RemoveSyntheticTrialObserver(
1152 SyntheticTrialObserver
* observer
) {
1153 synthetic_trial_observer_list_
.RemoveObserver(observer
);
1156 void MetricsService::RegisterSyntheticFieldTrial(
1157 const SyntheticTrialGroup
& trial
) {
1158 for (size_t i
= 0; i
< synthetic_trial_groups_
.size(); ++i
) {
1159 if (synthetic_trial_groups_
[i
].id
.name
== trial
.id
.name
) {
1160 if (synthetic_trial_groups_
[i
].id
.group
!= trial
.id
.group
) {
1161 synthetic_trial_groups_
[i
].id
.group
= trial
.id
.group
;
1162 synthetic_trial_groups_
[i
].start_time
= base::TimeTicks::Now();
1163 NotifySyntheticTrialObservers();
1169 SyntheticTrialGroup trial_group
= trial
;
1170 trial_group
.start_time
= base::TimeTicks::Now();
1171 synthetic_trial_groups_
.push_back(trial_group
);
1172 NotifySyntheticTrialObservers();
1175 void MetricsService::RegisterMetricsProvider(
1176 scoped_ptr
<MetricsProvider
> provider
) {
1177 DCHECK_EQ(INITIALIZED
, state_
);
1178 metrics_providers_
.push_back(provider
.release());
1181 void MetricsService::CheckForClonedInstall(
1182 scoped_refptr
<base::SingleThreadTaskRunner
> task_runner
) {
1183 state_manager_
->CheckForClonedInstall(task_runner
);
1186 void MetricsService::NotifySyntheticTrialObservers() {
1187 FOR_EACH_OBSERVER(SyntheticTrialObserver
, synthetic_trial_observer_list_
,
1188 OnSyntheticTrialsChanged(synthetic_trial_groups_
));
1191 void MetricsService::GetCurrentSyntheticFieldTrials(
1192 std::vector
<variations::ActiveGroupId
>* synthetic_trials
) {
1193 DCHECK(synthetic_trials
);
1194 synthetic_trials
->clear();
1195 const MetricsLog
* current_log
= log_manager_
.current_log();
1196 for (size_t i
= 0; i
< synthetic_trial_groups_
.size(); ++i
) {
1197 if (synthetic_trial_groups_
[i
].start_time
<= current_log
->creation_time())
1198 synthetic_trials
->push_back(synthetic_trial_groups_
[i
].id
);
1202 scoped_ptr
<MetricsLog
> MetricsService::CreateLog(MetricsLog::LogType log_type
) {
1203 return make_scoped_ptr(new MetricsLog(state_manager_
->client_id(),
1210 void MetricsService::RecordCurrentEnvironment(MetricsLog
* log
) {
1211 std::vector
<variations::ActiveGroupId
> synthetic_trials
;
1212 GetCurrentSyntheticFieldTrials(&synthetic_trials
);
1213 log
->RecordEnvironment(metrics_providers_
.get(), synthetic_trials
,
1215 UMA_HISTOGRAM_COUNTS_100("UMA.SyntheticTrials.Count",
1216 synthetic_trials
.size());
1219 void MetricsService::RecordCurrentHistograms() {
1220 DCHECK(log_manager_
.current_log());
1221 histogram_snapshot_manager_
.PrepareDeltas(
1222 base::Histogram::kNoFlags
, base::Histogram::kUmaTargetedHistogramFlag
);
1225 void MetricsService::RecordCurrentStabilityHistograms() {
1226 DCHECK(log_manager_
.current_log());
1227 histogram_snapshot_manager_
.PrepareDeltas(
1228 base::Histogram::kNoFlags
, base::Histogram::kUmaStabilityHistogramFlag
);
1231 void MetricsService::LogCleanShutdown() {
1232 // Redundant hack to write pref ASAP.
1233 MarkAppCleanShutdownAndCommit(&clean_exit_beacon_
, local_state_
);
1235 // Redundant setting to assure that we always reset this value at shutdown
1236 // (and that we don't use some alternate path, and not call LogCleanShutdown).
1237 clean_shutdown_status_
= CLEANLY_SHUTDOWN
;
1239 clean_exit_beacon_
.WriteBeaconValue(true);
1240 RecordCurrentState(local_state_
);
1241 local_state_
->SetInteger(prefs::kStabilityExecutionPhase
,
1242 MetricsService::SHUTDOWN_COMPLETE
);
1245 bool MetricsService::ShouldLogEvents() {
1246 // We simply don't log events to UMA if there is a single incognito
1247 // session visible. The problem is that we always notify using the orginal
1248 // profile in order to simplify notification processing.
1249 return !client_
->IsOffTheRecordSessionActive();
1252 void MetricsService::RecordBooleanPrefValue(const char* path
, bool value
) {
1253 DCHECK(IsSingleThreaded());
1254 local_state_
->SetBoolean(path
, value
);
1255 RecordCurrentState(local_state_
);
1258 void MetricsService::RecordCurrentState(PrefService
* pref
) {
1259 pref
->SetInt64(prefs::kStabilityLastTimestampSec
,
1260 base::Time::Now().ToTimeT());
1263 } // namespace metrics