// Copyright 2014 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

//------------------------------------------------------------------------------
// Description of the life cycle of an instance of MetricsService.
//
// OVERVIEW
//
// A MetricsService instance is typically created at application startup. It is
// the central controller for the acquisition of log data, and the automatic
// transmission of that log data to an external server. Its major job is to
// manage logs, grouping them for transmission, and transmitting them. As part
// of its grouping, MS finalizes logs by including some just-in-time gathered
// memory statistics, snapshotting the current stats of numerous histograms,
// closing the logs, translating to protocol buffer format, and compressing the
// results for transmission. Transmission includes submitting a compressed log
// as data in a URL-post, and retransmitting (or retaining at process
// termination) if the attempted transmission failed. Retention across process
// terminations is done using the PrefService facilities. The retained logs
// (the ones that never got transmitted) are compressed and base64-encoded
// before being persisted.
//
// Logs fall into one of two categories: "initial logs," and "ongoing logs."
// There is at most one initial log sent for each complete run of Chrome (from
// startup, to browser shutdown). An initial log is generally transmitted some
// short time (1 minute?) after startup, and includes stats such as recent crash
// info, the number and types of plugins, etc. The external server's response
// to the initial log conceptually tells this MS if it should continue
// transmitting logs (during this session). The server response can actually be
// much more detailed, and always includes (at a minimum) how often additional
// ongoing logs should be sent.
//
// After the above initial log, a series of ongoing logs will be transmitted.
// The first ongoing log actually begins to accumulate information starting when
// the MS was first constructed. Note that even though the initial log is
// commonly sent a full minute after startup, the initial log does not include
// much in the way of user stats. The most common interlog period (delay)
// is 30 minutes. That time period starts when the first user action causes a
// logging event. This means that if there is no user action, there may be long
// periods without any (ongoing) log transmissions. Ongoing logs typically
// contain very detailed records of user activities (e.g., opened tab, closed
// tab, fetched URL, maximized window, etc.). In addition, just before an
// ongoing log is closed out, a call is made to gather memory statistics. Those
// memory statistics are deposited into a histogram, and the log finalization
// code is then called. In the finalization, a call to a Histogram server
// acquires a list of all local histograms that have been flagged for upload
// to the UMA server. The finalization also acquires the most recent number
// of page loads, along with any counts of renderer or plugin crashes.
//
// When the browser shuts down, there will typically be a fragment of an ongoing
// log that has not yet been transmitted. At shutdown time, that fragment is
// closed (including snapshotting histograms), and persisted, for potential
// transmission during a future run of the product.
//
// There are two slightly abnormal shutdown conditions. There is a
// "disconnected scenario," and a "really fast startup and shutdown" scenario.
// In the "never connected" situation, the user has (during the running of the
// process) never established an internet connection. As a result, attempts to
// transmit the initial log have failed, and a lot(?) of data has accumulated in
// the ongoing log (which didn't yet get closed, because there was never even a
// contemplation of sending it). There is also a kindred "lost connection"
// situation, where a loss of connection prevented an ongoing log from being
// transmitted, and a (still open) log was stuck accumulating a lot(?) of data,
// while the earlier log retried its transmission. In both of these
// disconnected situations, two logs need to be, and are, persistently stored
// for future transmission.
//
// The other unusual shutdown condition, termed "really fast startup and
// shutdown," involves the deliberate user termination of the process before
// the initial log is even formed or transmitted. In that situation, no logging
// is done, but the historical crash statistics remain (unlogged) for inclusion
// in a future run's initial log. (i.e., we don't lose crash stats).
//
// With the above overview, we can now describe the state machine's various
// states, based on the State enum specified in the state_ member. Those states
// are:
//
//   INITIALIZED,                    // Constructor was called.
//   INIT_TASK_SCHEDULED,            // Waiting for deferred init tasks to finish.
//   INIT_TASK_DONE,                 // Waiting for timer to send initial log.
//   SENDING_INITIAL_STABILITY_LOG,  // Initial stability log being sent.
//   SENDING_INITIAL_METRICS_LOG,    // Initial metrics log being sent.
//   SENDING_OLD_LOGS,               // Sending unsent logs from previous session.
//   SENDING_CURRENT_LOGS,           // Sending ongoing logs as they accrue.
//
// In more detail, we have:
//
//   INITIALIZED,            // Constructor was called.
// The MS has been constructed, but has taken no actions to compose the
// initial log.
//
//   INIT_TASK_SCHEDULED,    // Waiting for deferred init tasks to finish.
// Typically about 30 seconds after startup, a task is sent to a second thread
// (the file thread) to perform deferred (lower priority and slower)
// initialization steps such as getting the list of plugins. That task will
// (when complete) make an async callback (via a Task) to indicate the
// completion.
//
//   INIT_TASK_DONE,         // Waiting for timer to send initial log.
// The callback has arrived, and it is now possible for an initial log to be
// created. This callback typically arrives back less than one second after
// the deferred init task is dispatched.
//
//   SENDING_INITIAL_STABILITY_LOG,  // Initial stability log being sent.
// During initialization, if a crash occurred during the previous session, an
// initial stability log will be generated and registered with the log manager.
// This state will be entered if a stability log was prepared during metrics
// service initialization (in InitializeMetricsRecordingState()) and is waiting
// to be transmitted when it's time to send up the first log (per the reporting
// scheduler). If there is no initial stability log (e.g. there was no previous
// crash), then this state will be skipped and the state will advance to
// SENDING_INITIAL_METRICS_LOG.
//
//   SENDING_INITIAL_METRICS_LOG,  // Initial metrics log being sent.
// This state is entered after the initial metrics log has been composed, and
// prepared for transmission. This happens after SENDING_INITIAL_STABILITY_LOG
// if there was an initial stability log (see above). It is also the case that
// any previously unsent logs have been loaded into instance variables for
// possible transmission.
//
//   SENDING_OLD_LOGS,       // Sending unsent logs from previous session.
// This state indicates that the initial log for this session has been
// successfully sent and it is now time to send any logs that were
// saved from previous sessions. All such logs will be transmitted before
// exiting this state, and proceeding with ongoing logs from the current session
// (see next state).
//
//   SENDING_CURRENT_LOGS,   // Sending standard current logs as they accrue.
// Current logs are being accumulated. Typically every 20 minutes a log is
// closed and finalized for transmission, at the same time as a new log is
// started.
//
// The progression through the above states is simple, and sequential, in the
// most common use cases. States proceed from INITIALIZED to
// SENDING_CURRENT_LOGS, and remain in the latter until shutdown.
//
// The one unusual case is when the user asks that we stop logging. When that
// happens, any staged (transmission in progress) log is persisted, and any log
// that is currently accumulating is also finalized and persisted. We then
// regress back to the SENDING_OLD_LOGS state in case the user enables log
// recording again during this session. This way anything we have persisted
// will be sent automatically if/when we progress back to the
// SENDING_CURRENT_LOGS state.
//
// Another similar case is on mobile, when the application is backgrounded and
// then foregrounded again. Backgrounding created new "old" stored logs, so the
// state drops back from SENDING_CURRENT_LOGS to SENDING_OLD_LOGS so those logs
// will be sent.
//
// Also note that whenever we successfully send an old log, we mirror the list
// of logs into the PrefService. This ensures that IF we crash, we won't start
// up and retransmit our old logs again.
//
// Due to race conditions, it is always possible that a log file could be sent
// twice. For example, if a log file is sent, but not yet acknowledged by
// the external server, and the user shuts down, then a copy of the log may be
// saved for re-transmission. These duplicates could be filtered out server
// side, but are not expected to be a significant problem.
//
//
//------------------------------------------------------------------------------

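// The state progression described above can be sketched as a small transition
// function. This is a minimal, hypothetical illustration, not the code used by
// MetricsService, whose transitions are spread across several methods. Only
// the State values come from the description above; NextState(),
// OnLogsPersisted() and their boolean parameters are names assumed here.

```cpp
#include <cassert>

// State values as listed in the life-cycle description.
enum State {
  INITIALIZED,
  INIT_TASK_SCHEDULED,
  INIT_TASK_DONE,
  SENDING_INITIAL_STABILITY_LOG,
  SENDING_INITIAL_METRICS_LOG,
  SENDING_OLD_LOGS,
  SENDING_CURRENT_LOGS,
};

// Hypothetical helper: returns the state that follows |current| in the common,
// sequential case. |has_stability_log| says whether an initial stability log
// was prepared; |has_unsent_logs| says whether stored logs remain to be sent.
State NextState(State current, bool has_stability_log, bool has_unsent_logs) {
  switch (current) {
    case INITIALIZED:
      return INIT_TASK_SCHEDULED;
    case INIT_TASK_SCHEDULED:
      return INIT_TASK_DONE;
    case INIT_TASK_DONE:
      // The stability state is skipped when there is no stability log.
      return has_stability_log ? SENDING_INITIAL_STABILITY_LOG
                               : SENDING_INITIAL_METRICS_LOG;
    case SENDING_INITIAL_STABILITY_LOG:
      return SENDING_INITIAL_METRICS_LOG;
    case SENDING_INITIAL_METRICS_LOG:
      return SENDING_OLD_LOGS;
    case SENDING_OLD_LOGS:
      // Stay here until every stored log has been transmitted.
      return has_unsent_logs ? SENDING_OLD_LOGS : SENDING_CURRENT_LOGS;
    case SENDING_CURRENT_LOGS:
      return SENDING_CURRENT_LOGS;
  }
  return current;
}

// Hypothetical helper: models the regression that happens when logs are
// persisted (logging stopped, or the app was backgrounded on mobile).
State OnLogsPersisted(State current, bool has_unsent_logs) {
  if (current < SENDING_INITIAL_STABILITY_LOG)
    return current;  // Nothing was persisted this early in the life cycle.
  return has_unsent_logs ? SENDING_OLD_LOGS : current;
}
```

// For example, NextState(INIT_TASK_DONE, false, false) skips the stability
// state and yields SENDING_INITIAL_METRICS_LOG, while
// OnLogsPersisted(SENDING_CURRENT_LOGS, true) drops back to SENDING_OLD_LOGS.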
#include "components/metrics/metrics_service.h"

#include <algorithm>

#include "base/bind.h"
#include "base/callback.h"
#include "base/metrics/histogram.h"
#include "base/metrics/histogram_base.h"
#include "base/metrics/histogram_samples.h"
#include "base/metrics/sparse_histogram.h"
#include "base/metrics/statistics_recorder.h"
#include "base/prefs/pref_registry_simple.h"
#include "base/prefs/pref_service.h"
#include "base/strings/string_number_conversions.h"
#include "base/strings/utf_string_conversions.h"
#include "base/threading/platform_thread.h"
#include "base/threading/thread.h"
#include "base/threading/thread_restrictions.h"
#include "base/time/time.h"
#include "base/tracked_objects.h"
#include "base/values.h"
#include "components/metrics/metrics_log.h"
#include "components/metrics/metrics_log_manager.h"
#include "components/metrics/metrics_log_uploader.h"
#include "components/metrics/metrics_pref_names.h"
#include "components/metrics/metrics_reporting_scheduler.h"
#include "components/metrics/metrics_service_client.h"
#include "components/metrics/metrics_state_manager.h"
#include "components/variations/entropy_provider.h"

namespace metrics {

namespace {

// Check to see that we're being called on only one thread.
bool IsSingleThreaded() {
  static base::PlatformThreadId thread_id = 0;
  if (!thread_id)
    thread_id = base::PlatformThread::CurrentId();
  return base::PlatformThread::CurrentId() == thread_id;
}

// The delay, in seconds, after starting recording before doing expensive
// initialization work.
#if defined(OS_ANDROID) || defined(OS_IOS)
// On mobile devices, a significant portion of sessions last less than a minute.
// Use a shorter timer on these platforms to avoid losing data.
// TODO(dfalcantara): To avoid delaying startup, tighten up initialization so
// that it occurs after the user gets their initial page.
const int kInitializationDelaySeconds = 5;
#else
const int kInitializationDelaySeconds = 30;
#endif

// The maximum number of events in a log uploaded to the UMA server.
const int kEventLimit = 2400;

// If an upload fails, and the transmission was over this byte count, then we
// will discard the log, and not try to retransmit it. We also don't persist
// the log to the prefs for transmission during the next chrome session if this
// limit is exceeded.
const size_t kUploadLogAvoidRetransmitSize = 100 * 1024;

// Interval, in minutes, between state saves.
const int kSaveStateIntervalMinutes = 5;

enum ResponseStatus {
  UNKNOWN_FAILURE,
  SUCCESS,
  BAD_REQUEST,  // Invalid syntax or log too large.
  NO_RESPONSE,
  NUM_RESPONSE_STATUSES
};

ResponseStatus ResponseCodeToStatus(int response_code) {
  switch (response_code) {
    case -1:
      return NO_RESPONSE;
    case 200:
      return SUCCESS;
    case 400:
      return BAD_REQUEST;
    default:
      return UNKNOWN_FAILURE;
  }
}

void MarkAppCleanShutdownAndCommit(CleanExitBeacon* clean_exit_beacon,
                                   PrefService* local_state) {
  clean_exit_beacon->WriteBeaconValue(true);
  local_state->SetInteger(prefs::kStabilityExecutionPhase,
                          MetricsService::SHUTDOWN_COMPLETE);
  // Start writing right away (write happens on a different thread).
  local_state->CommitPendingWrite();
}

}  // namespace

SyntheticTrialGroup::SyntheticTrialGroup(uint32 trial, uint32 group) {
  id.name = trial;
  id.group = group;
}

SyntheticTrialGroup::~SyntheticTrialGroup() {
}

// static
MetricsService::ShutdownCleanliness MetricsService::clean_shutdown_status_ =
    MetricsService::CLEANLY_SHUTDOWN;

MetricsService::ExecutionPhase MetricsService::execution_phase_ =
    MetricsService::UNINITIALIZED_PHASE;

// static
void MetricsService::RegisterPrefs(PrefRegistrySimple* registry) {
  DCHECK(IsSingleThreaded());
  MetricsStateManager::RegisterPrefs(registry);
  MetricsLog::RegisterPrefs(registry);

  registry->RegisterInt64Pref(prefs::kInstallDate, 0);

  registry->RegisterInt64Pref(prefs::kStabilityLaunchTimeSec, 0);
  registry->RegisterInt64Pref(prefs::kStabilityLastTimestampSec, 0);
  registry->RegisterStringPref(prefs::kStabilityStatsVersion, std::string());
  registry->RegisterInt64Pref(prefs::kStabilityStatsBuildTime, 0);
  registry->RegisterBooleanPref(prefs::kStabilityExitedCleanly, true);
  registry->RegisterIntegerPref(prefs::kStabilityExecutionPhase,
                                UNINITIALIZED_PHASE);
  registry->RegisterBooleanPref(prefs::kStabilitySessionEndCompleted, true);
  registry->RegisterIntegerPref(prefs::kMetricsSessionID, -1);

  registry->RegisterListPref(prefs::kMetricsInitialLogs);
  registry->RegisterListPref(prefs::kMetricsOngoingLogs);

  registry->RegisterInt64Pref(prefs::kUninstallLaunchCount, 0);
  registry->RegisterInt64Pref(prefs::kUninstallMetricsUptimeSec, 0);
}

MetricsService::MetricsService(MetricsStateManager* state_manager,
                               MetricsServiceClient* client,
                               PrefService* local_state)
    : log_manager_(local_state, kUploadLogAvoidRetransmitSize),
      histogram_snapshot_manager_(this),
      state_manager_(state_manager),
      client_(client),
      local_state_(local_state),
      clean_exit_beacon_(client->GetRegistryBackupKey(), local_state),
      recording_active_(false),
      reporting_active_(false),
      test_mode_active_(false),
      state_(INITIALIZED),
      has_initial_stability_log_(false),
      log_upload_in_progress_(false),
      idle_since_last_transmission_(false),
      session_id_(-1),
      self_ptr_factory_(this),
      state_saver_factory_(this) {
  DCHECK(IsSingleThreaded());
  DCHECK(state_manager_);
  DCHECK(client_);
  DCHECK(local_state_);

  // Set the install date if this is our first run.
  int64 install_date = local_state_->GetInt64(prefs::kInstallDate);
  if (install_date == 0)
    local_state_->SetInt64(prefs::kInstallDate, base::Time::Now().ToTimeT());
}

MetricsService::~MetricsService() {
  DisableRecording();
}

void MetricsService::InitializeMetricsRecordingState() {
  InitializeMetricsState();

  base::Closure upload_callback =
      base::Bind(&MetricsService::StartScheduledUpload,
                 self_ptr_factory_.GetWeakPtr());
  scheduler_.reset(
      new MetricsReportingScheduler(upload_callback, is_cellular_callback_));
}

void MetricsService::Start() {
  HandleIdleSinceLastTransmission(false);
  EnableRecording();
  EnableReporting();
}

bool MetricsService::StartIfMetricsReportingEnabled() {
  const bool enabled = state_manager_->IsMetricsReportingEnabled();
  if (enabled)
    Start();
  return enabled;
}

void MetricsService::StartRecordingForTests() {
  test_mode_active_ = true;
  EnableRecording();
  DisableReporting();
}

void MetricsService::Stop() {
  HandleIdleSinceLastTransmission(false);
  DisableReporting();
  DisableRecording();
}

void MetricsService::EnableReporting() {
  if (reporting_active_)
    return;
  reporting_active_ = true;
  StartSchedulerIfNecessary();
}

void MetricsService::DisableReporting() {
  reporting_active_ = false;
}

std::string MetricsService::GetClientId() {
  return state_manager_->client_id();
}

int64 MetricsService::GetInstallDate() {
  return local_state_->GetInt64(prefs::kInstallDate);
}

scoped_ptr<const base::FieldTrial::EntropyProvider>
MetricsService::CreateEntropyProvider() {
  // TODO(asvitkine): Refactor the code so that MetricsService does not expose
  // this method.
  return state_manager_->CreateEntropyProvider();
}

void MetricsService::EnableRecording() {
  DCHECK(IsSingleThreaded());

  if (recording_active_)
    return;
  recording_active_ = true;

  state_manager_->ForceClientIdCreation();
  client_->SetMetricsClientId(state_manager_->client_id());
  if (!log_manager_.current_log())
    OpenNewLog();

  for (size_t i = 0; i < metrics_providers_.size(); ++i)
    metrics_providers_[i]->OnRecordingEnabled();

  base::RemoveActionCallback(action_callback_);
  action_callback_ = base::Bind(&MetricsService::OnUserAction,
                                base::Unretained(this));
  base::AddActionCallback(action_callback_);
}

void MetricsService::DisableRecording() {
  DCHECK(IsSingleThreaded());

  if (!recording_active_)
    return;
  recording_active_ = false;

  base::RemoveActionCallback(action_callback_);

  for (size_t i = 0; i < metrics_providers_.size(); ++i)
    metrics_providers_[i]->OnRecordingDisabled();

  PushPendingLogsToPersistentStorage();
}

bool MetricsService::recording_active() const {
  DCHECK(IsSingleThreaded());
  return recording_active_;
}

bool MetricsService::reporting_active() const {
  DCHECK(IsSingleThreaded());
  return reporting_active_;
}

void MetricsService::RecordDelta(const base::HistogramBase& histogram,
                                 const base::HistogramSamples& snapshot) {
  log_manager_.current_log()->RecordHistogramDelta(histogram.histogram_name(),
                                                   snapshot);
}

void MetricsService::InconsistencyDetected(
    base::HistogramBase::Inconsistency problem) {
  UMA_HISTOGRAM_ENUMERATION("Histogram.InconsistenciesBrowser",
                            problem, base::HistogramBase::NEVER_EXCEEDED_VALUE);
}

void MetricsService::UniqueInconsistencyDetected(
    base::HistogramBase::Inconsistency problem) {
  UMA_HISTOGRAM_ENUMERATION("Histogram.InconsistenciesBrowserUnique",
                            problem, base::HistogramBase::NEVER_EXCEEDED_VALUE);
}

void MetricsService::InconsistencyDetectedInLoggedCount(int amount) {
  UMA_HISTOGRAM_COUNTS("Histogram.InconsistentSnapshotBrowser",
                       std::abs(amount));
}

void MetricsService::HandleIdleSinceLastTransmission(bool in_idle) {
  // If there wasn't a lot of action, maybe the computer was asleep, in which
  // case, the log transmissions should have stopped. Here we start them up
  // again.
  if (!in_idle && idle_since_last_transmission_)
    StartSchedulerIfNecessary();
  idle_since_last_transmission_ = in_idle;
}

void MetricsService::OnApplicationNotIdle() {
  if (recording_active_)
    HandleIdleSinceLastTransmission(false);
}

void MetricsService::RecordStartOfSessionEnd() {
  LogCleanShutdown();
  RecordBooleanPrefValue(prefs::kStabilitySessionEndCompleted, false);
}

void MetricsService::RecordCompletedSessionEnd() {
  LogCleanShutdown();
  RecordBooleanPrefValue(prefs::kStabilitySessionEndCompleted, true);
}

#if defined(OS_ANDROID) || defined(OS_IOS)
void MetricsService::OnAppEnterBackground() {
  scheduler_->Stop();

  MarkAppCleanShutdownAndCommit(&clean_exit_beacon_, local_state_);

  // At this point, there's no way of knowing when the process will be
  // killed, so this has to be treated similar to a shutdown, closing and
  // persisting all logs. Unlike a shutdown, the state is primed to be ready
  // to continue logging and uploading if the process does return.
  if (recording_active() && state_ >= SENDING_INITIAL_STABILITY_LOG) {
    PushPendingLogsToPersistentStorage();
    // Persisting logs closes the current log, so start recording a new log
    // immediately to capture any background work that might be done before the
    // process is killed.
    OpenNewLog();
  }
}

void MetricsService::OnAppEnterForeground() {
  clean_exit_beacon_.WriteBeaconValue(false);
  StartSchedulerIfNecessary();
}
#else
void MetricsService::LogNeedForCleanShutdown() {
  clean_exit_beacon_.WriteBeaconValue(false);
  // Redundant setting to be sure we call for a clean shutdown.
  clean_shutdown_status_ = NEED_TO_SHUTDOWN;
}
#endif  // defined(OS_ANDROID) || defined(OS_IOS)

// static
void MetricsService::SetExecutionPhase(ExecutionPhase execution_phase,
                                       PrefService* local_state) {
  execution_phase_ = execution_phase;
  local_state->SetInteger(prefs::kStabilityExecutionPhase, execution_phase_);
}

void MetricsService::RecordBreakpadRegistration(bool success) {
  if (!success)
    IncrementPrefValue(prefs::kStabilityBreakpadRegistrationFail);
  else
    IncrementPrefValue(prefs::kStabilityBreakpadRegistrationSuccess);
}

void MetricsService::RecordBreakpadHasDebugger(bool has_debugger) {
  if (!has_debugger)
    IncrementPrefValue(prefs::kStabilityDebuggerNotPresent);
  else
    IncrementPrefValue(prefs::kStabilityDebuggerPresent);
}

void MetricsService::ClearSavedStabilityMetrics() {
  for (size_t i = 0; i < metrics_providers_.size(); ++i)
    metrics_providers_[i]->ClearSavedStabilityMetrics();

  // Reset the prefs that are managed by MetricsService/MetricsLog directly.
  local_state_->SetInteger(prefs::kStabilityCrashCount, 0);
  local_state_->SetInteger(prefs::kStabilityExecutionPhase,
                           UNINITIALIZED_PHASE);
  local_state_->SetInteger(prefs::kStabilityIncompleteSessionEndCount, 0);
  local_state_->SetInteger(prefs::kStabilityLaunchCount, 0);
  local_state_->SetBoolean(prefs::kStabilitySessionEndCompleted, true);
}

//------------------------------------------------------------------------------
// private methods
//------------------------------------------------------------------------------


//------------------------------------------------------------------------------
// Initialization methods

void MetricsService::InitializeMetricsState() {
  const int64 buildtime = MetricsLog::GetBuildTime();
  const std::string version = client_->GetVersionString();
  bool version_changed = false;
  if (local_state_->GetInt64(prefs::kStabilityStatsBuildTime) != buildtime ||
      local_state_->GetString(prefs::kStabilityStatsVersion) != version) {
    local_state_->SetString(prefs::kStabilityStatsVersion, version);
    local_state_->SetInt64(prefs::kStabilityStatsBuildTime, buildtime);
    version_changed = true;
  }

  log_manager_.LoadPersistedUnsentLogs();

  session_id_ = local_state_->GetInteger(prefs::kMetricsSessionID);

  if (!clean_exit_beacon_.exited_cleanly()) {
    IncrementPrefValue(prefs::kStabilityCrashCount);
    // Reset flag, and wait until we call LogNeedForCleanShutdown() before
    // monitoring.
    clean_exit_beacon_.WriteBeaconValue(true);
  }

  if (!clean_exit_beacon_.exited_cleanly() || ProvidersHaveStabilityMetrics()) {
    // TODO(rtenneti): On windows, consider saving/getting execution_phase from
    // the registry.
    int execution_phase =
        local_state_->GetInteger(prefs::kStabilityExecutionPhase);
    UMA_HISTOGRAM_SPARSE_SLOWLY("Chrome.Browser.CrashedExecutionPhase",
                                execution_phase);

    // If the previous session didn't exit cleanly, or if any provider
    // explicitly requests it, prepare an initial stability log -
    // provided UMA is enabled.
    if (state_manager_->IsMetricsReportingEnabled())
      PrepareInitialStabilityLog();
  }

  // If no initial stability log was generated and there was a version upgrade,
  // clear the stability stats from the previous version (so that they don't get
  // attributed to the current version). This could otherwise happen due to a
  // number of different edge cases, such as if the last version crashed before
  // it could save off a system profile or if UMA reporting is disabled (which
  // normally results in stats being accumulated).
  if (!has_initial_stability_log_ && version_changed)
    ClearSavedStabilityMetrics();

  // Update session ID.
  ++session_id_;
  local_state_->SetInteger(prefs::kMetricsSessionID, session_id_);

  // Stability bookkeeping
  IncrementPrefValue(prefs::kStabilityLaunchCount);

  DCHECK_EQ(UNINITIALIZED_PHASE, execution_phase_);
  SetExecutionPhase(START_METRICS_RECORDING, local_state_);

  if (!local_state_->GetBoolean(prefs::kStabilitySessionEndCompleted)) {
    IncrementPrefValue(prefs::kStabilityIncompleteSessionEndCount);
    // This is marked false when we get a WM_ENDSESSION.
    local_state_->SetBoolean(prefs::kStabilitySessionEndCompleted, true);
  }

  // Call GetUptimes() for the first time, thus allowing all later calls
  // to record incremental uptimes accurately.
  base::TimeDelta ignored_uptime_parameter;
  base::TimeDelta startup_uptime;
  GetUptimes(local_state_, &startup_uptime, &ignored_uptime_parameter);
  DCHECK_EQ(0, startup_uptime.InMicroseconds());
  // For backwards compatibility, leave this intact in case Omaha is checking
  // them. prefs::kStabilityLastTimestampSec may also be useless now.
  // TODO(jar): Delete these if they have no uses.
  local_state_->SetInt64(prefs::kStabilityLaunchTimeSec,
                         base::Time::Now().ToTimeT());

  // Bookkeeping for the uninstall metrics.
  IncrementLongPrefsValue(prefs::kUninstallLaunchCount);

  // Kick off the process of saving the state (so the uptime numbers keep
  // getting updated) every n minutes.
  ScheduleNextStateSave();
}

void MetricsService::OnUserAction(const std::string& action) {
  if (!ShouldLogEvents())
    return;

  log_manager_.current_log()->RecordUserAction(action);
  HandleIdleSinceLastTransmission(false);
}

void MetricsService::FinishedGatheringInitialMetrics() {
  DCHECK_EQ(INIT_TASK_SCHEDULED, state_);
  state_ = INIT_TASK_DONE;

  // Create the initial log.
  if (!initial_metrics_log_.get()) {
    initial_metrics_log_ = CreateLog(MetricsLog::ONGOING_LOG);
    NotifyOnDidCreateMetricsLog();
  }

  scheduler_->InitTaskComplete();
}

void MetricsService::GetUptimes(PrefService* pref,
                                base::TimeDelta* incremental_uptime,
                                base::TimeDelta* uptime) {
  base::TimeTicks now = base::TimeTicks::Now();
  // If this is the first call, init |first_updated_time_| and
  // |last_updated_time_|.
  if (last_updated_time_.is_null()) {
    first_updated_time_ = now;
    last_updated_time_ = now;
  }
  *incremental_uptime = now - last_updated_time_;
  *uptime = now - first_updated_time_;
  last_updated_time_ = now;

  const int64 incremental_time_secs = incremental_uptime->InSeconds();
  if (incremental_time_secs > 0) {
    int64 metrics_uptime = pref->GetInt64(prefs::kUninstallMetricsUptimeSec);
    metrics_uptime += incremental_time_secs;
    pref->SetInt64(prefs::kUninstallMetricsUptimeSec, metrics_uptime);
  }
}

void MetricsService::NotifyOnDidCreateMetricsLog() {
  DCHECK(IsSingleThreaded());
  for (size_t i = 0; i < metrics_providers_.size(); ++i)
    metrics_providers_[i]->OnDidCreateMetricsLog();
}

//------------------------------------------------------------------------------
// State save methods

void MetricsService::ScheduleNextStateSave() {
  state_saver_factory_.InvalidateWeakPtrs();

  base::MessageLoop::current()->PostDelayedTask(FROM_HERE,
      base::Bind(&MetricsService::SaveLocalState,
                 state_saver_factory_.GetWeakPtr()),
      base::TimeDelta::FromMinutes(kSaveStateIntervalMinutes));
}

void MetricsService::SaveLocalState() {
  RecordCurrentState(local_state_);

  // TODO(jar):110021 Does this run down the batteries????
  ScheduleNextStateSave();
}


//------------------------------------------------------------------------------
// Recording control methods

void MetricsService::OpenNewLog() {
  DCHECK(!log_manager_.current_log());

  log_manager_.BeginLoggingWithLog(CreateLog(MetricsLog::ONGOING_LOG));
  NotifyOnDidCreateMetricsLog();
  if (state_ == INITIALIZED) {
    // We only need to schedule that run once.
    state_ = INIT_TASK_SCHEDULED;

    base::MessageLoop::current()->PostDelayedTask(
        FROM_HERE,
        base::Bind(&MetricsService::StartGatheringMetrics,
                   self_ptr_factory_.GetWeakPtr()),
        base::TimeDelta::FromSeconds(kInitializationDelaySeconds));
  }
}

void MetricsService::StartGatheringMetrics() {
  client_->StartGatheringMetrics(
      base::Bind(&MetricsService::FinishedGatheringInitialMetrics,
                 self_ptr_factory_.GetWeakPtr()));
}

void MetricsService::CloseCurrentLog() {
  if (!log_manager_.current_log())
    return;

  // TODO(jar): Integrate bounds on log recording more consistently, so that we
  // can stop recording logs that are too big much sooner.
  if (log_manager_.current_log()->num_events() > kEventLimit) {
    UMA_HISTOGRAM_COUNTS("UMA.Discarded Log Events",
                         log_manager_.current_log()->num_events());
    log_manager_.DiscardCurrentLog();
    OpenNewLog();  // Start trivial log to hold our histograms.
  }

  // Put incremental data (histogram deltas, and realtime stats deltas) at the
  // end of all log transmissions (initial log handles this separately).
  // RecordIncrementalStabilityElements only exists on the derived
  // MetricsLog class.
  MetricsLog* current_log = log_manager_.current_log();
  DCHECK(current_log);
  RecordCurrentEnvironment(current_log);
  base::TimeDelta incremental_uptime;
  base::TimeDelta uptime;
  GetUptimes(local_state_, &incremental_uptime, &uptime);
  current_log->RecordStabilityMetrics(metrics_providers_.get(),
                                      incremental_uptime, uptime);

  current_log->RecordGeneralMetrics(metrics_providers_.get());
  RecordCurrentHistograms();

  log_manager_.FinishCurrentLog();
}

void MetricsService::PushPendingLogsToPersistentStorage() {
  if (state_ < SENDING_INITIAL_STABILITY_LOG)
    return;  // We didn't and still don't have time to get plugin list etc.

  CloseCurrentLog();
  log_manager_.PersistUnsentLogs();

  // If there was a staged and/or current log, then there is now at least one
  // log waiting to be uploaded.
  if (log_manager_.has_unsent_logs())
    state_ = SENDING_OLD_LOGS;
}

//------------------------------------------------------------------------------
// Transmission of logs methods

void MetricsService::StartSchedulerIfNecessary() {
  // Never schedule cutting or uploading of logs in test mode.
  if (test_mode_active_)
    return;

  // Even if reporting is disabled, the scheduler is needed to trigger the
  // creation of the initial log, which must be done in order for any logs to be
  // persisted on shutdown or backgrounding.
  if (recording_active() &&
      (reporting_active() || state_ < SENDING_INITIAL_STABILITY_LOG)) {
    scheduler_->Start();
  }
}

void MetricsService::StartScheduledUpload() {
  // If we're getting no notifications, then the log won't have much in it, and
  // it's possible the computer is about to go to sleep, so don't upload and
  // stop the scheduler.
  // If recording has been turned off, the scheduler doesn't need to run.
  // If reporting is off, proceed if the initial log hasn't been created, since
  // that has to happen in order for logs to be cut and stored when persisting.
  // TODO(stuartmorgan): Call Stop() on the scheduler when reporting and/or
  // recording are turned off instead of letting it fire and then aborting.
  if (idle_since_last_transmission_ ||
      !recording_active() ||
      (!reporting_active() && state_ >= SENDING_INITIAL_STABILITY_LOG)) {
    scheduler_->Stop();
    scheduler_->UploadCancelled();
    return;
  }

  // If the callback was to upload an old log, but there no longer is one,
  // just report success back to the scheduler to begin the ongoing log
  // callbacks.
  // TODO(stuartmorgan): Consider removing the distinction between
  // SENDING_OLD_LOGS and SENDING_CURRENT_LOGS to simplify the state machine
  // now that the log upload flow is the same for both modes.
  if (state_ == SENDING_OLD_LOGS && !log_manager_.has_unsent_logs()) {
    state_ = SENDING_CURRENT_LOGS;
    scheduler_->UploadFinished(true /* healthy */, false /* no unsent logs */);
    return;
  }
  // If there are unsent logs, send the next one. If not, start the asynchronous
  // process of finalizing the current log for upload.
  if (state_ == SENDING_OLD_LOGS) {
    DCHECK(log_manager_.has_unsent_logs());
    log_manager_.StageNextLogForUpload();
    SendStagedLog();
  } else {
    client_->CollectFinalMetrics(
        base::Bind(&MetricsService::OnFinalLogInfoCollectionDone,
                   self_ptr_factory_.GetWeakPtr()));
  }
}

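// Illustrative sketch (not part of this file): the start condition in
// StartSchedulerIfNecessary() and the cancel condition in
// StartScheduledUpload() are logical complements of each other, apart from the
// extra idle check. The State enum below is an assumed stand-in whose ordering
// is inferred from the comparisons above; SchedulerHasWork and
// ShouldCancelScheduledUpload are hypothetical helper names.

```cpp
// Assumed stand-in for the service's state enum. Only the relative ordering
// matters for this sketch; the real declaration lives in the header.
enum State {
  INITIALIZED,
  INIT_TASK_SCHEDULED,
  INIT_TASK_DONE,
  SENDING_INITIAL_STABILITY_LOG,
  SENDING_INITIAL_METRICS_LOG,
  SENDING_OLD_LOGS,
  SENDING_CURRENT_LOGS,
};

// True when the scheduler has work to do: recording is on, and either
// reporting is on or the initial log has not been created yet.
bool SchedulerHasWork(bool recording, bool reporting, State state) {
  return recording && (reporting || state < SENDING_INITIAL_STABILITY_LOG);
}

// True when a scheduled upload should be abandoned and the scheduler stopped.
// By De Morgan's laws this equals idle || !SchedulerHasWork(...).
bool ShouldCancelScheduledUpload(bool idle, bool recording, bool reporting,
                                 State state) {
  return idle || !recording ||
         (!reporting && state >= SENDING_INITIAL_STABILITY_LOG);
}
```

// With reporting off, the scheduler still has work until the initial log
// exists; after that point a scheduled upload is cancelled.
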
void MetricsService::OnFinalLogInfoCollectionDone() {
  // If somehow there is a log upload in progress, we return and hope things
  // work out. The scheduler isn't informed since if this happens, the scheduler
  // will get a response from the upload.
  DCHECK(!log_upload_in_progress_);
  if (log_upload_in_progress_)
    return;

  // Abort if metrics were turned off during the final info gathering.
  if (!recording_active()) {
    scheduler_->Stop();
    scheduler_->UploadCancelled();
    return;
  }

  StageNewLog();

  // If logs shouldn't be uploaded, stop here. It's important that this check
  // be after StageNewLog(), otherwise the previous logs will never be loaded,
  // and thus the open log won't be persisted.
  // TODO(stuartmorgan): This is unnecessarily complicated; restructure loading
  // of previous logs to not require running part of the upload logic.
  // http://crbug.com/157337
  if (!reporting_active()) {
    scheduler_->Stop();
    scheduler_->UploadCancelled();
    return;
  }

  SendStagedLog();
}

void MetricsService::StageNewLog() {
  if (log_manager_.has_staged_log())
    return;

  switch (state_) {
    case INITIALIZED:
    case INIT_TASK_SCHEDULED:  // We should be further along by now.
      NOTREACHED();
      return;

    case INIT_TASK_DONE:
      PrepareInitialMetricsLog();
      // Stage the first log, which could be a stability log (either one
      // created in this session or from a previous session) or the
      // initial metrics log that was just created.
      log_manager_.StageNextLogForUpload();
      if (has_initial_stability_log_) {
        // The initial stability log was just staged.
        has_initial_stability_log_ = false;
        state_ = SENDING_INITIAL_STABILITY_LOG;
      } else {
        state_ = SENDING_INITIAL_METRICS_LOG;
      }
      break;

    case SENDING_OLD_LOGS:
      NOTREACHED();  // Shouldn't be staging a new log during old log sending.
      return;

    case SENDING_CURRENT_LOGS:
      CloseCurrentLog();
      OpenNewLog();
      log_manager_.StageNextLogForUpload();
      break;

    default:
      NOTREACHED();
      return;
  }

  DCHECK(log_manager_.has_staged_log());
}

bool MetricsService::ProvidersHaveStabilityMetrics() {
  // Check whether any metrics provider has stability metrics.
  for (size_t i = 0; i < metrics_providers_.size(); ++i) {
    if (metrics_providers_[i]->HasStabilityMetrics())
      return true;
  }

  return false;
}

void MetricsService::PrepareInitialStabilityLog() {
  DCHECK_EQ(INITIALIZED, state_);

  scoped_ptr<MetricsLog> initial_stability_log(
      CreateLog(MetricsLog::INITIAL_STABILITY_LOG));

  // Do not call NotifyOnDidCreateMetricsLog here because the stability
  // log describes stats from the _previous_ session.

  if (!initial_stability_log->LoadSavedEnvironmentFromPrefs())
    return;

  log_manager_.PauseCurrentLog();
  log_manager_.BeginLoggingWithLog(initial_stability_log.Pass());

  // Note: Some stability providers may record stability stats via histograms,
  // so this call has to be after BeginLoggingWithLog().
  log_manager_.current_log()->RecordStabilityMetrics(
      metrics_providers_.get(), base::TimeDelta(), base::TimeDelta());
  RecordCurrentStabilityHistograms();

  // Note: RecordGeneralMetrics() intentionally not called since this log is for
  // stability stats from a previous session only.

  log_manager_.FinishCurrentLog();
  log_manager_.ResumePausedLog();

  // Store unsent logs, including the stability log that was just saved, so
  // that they're not lost in case of a crash before upload time.
  log_manager_.PersistUnsentLogs();

  has_initial_stability_log_ = true;
}

void MetricsService::PrepareInitialMetricsLog() {
  DCHECK(state_ == INIT_TASK_DONE || state_ == SENDING_INITIAL_STABILITY_LOG);

  RecordCurrentEnvironment(initial_metrics_log_.get());
  base::TimeDelta incremental_uptime;
  base::TimeDelta uptime;
  GetUptimes(local_state_, &incremental_uptime, &uptime);

  // Histograms only get written to the current log, so make the new log current
  // before writing them.
  log_manager_.PauseCurrentLog();
  log_manager_.BeginLoggingWithLog(initial_metrics_log_.Pass());

  // Note: Some stability providers may record stability stats via histograms,
  // so this call has to be after BeginLoggingWithLog().
  MetricsLog* current_log = log_manager_.current_log();
  current_log->RecordStabilityMetrics(metrics_providers_.get(),
                                      base::TimeDelta(), base::TimeDelta());
  current_log->RecordGeneralMetrics(metrics_providers_.get());
  RecordCurrentHistograms();

  log_manager_.FinishCurrentLog();
  log_manager_.ResumePausedLog();

  // Store unsent logs, including the initial log that was just saved, so
  // that they're not lost in case of a crash before upload time.
  log_manager_.PersistUnsentLogs();
}

void MetricsService::SendStagedLog() {
  DCHECK(log_manager_.has_staged_log());
  if (!log_manager_.has_staged_log())
    return;

  DCHECK(!log_upload_in_progress_);
  log_upload_in_progress_ = true;

  if (!log_uploader_) {
    log_uploader_ = client_->CreateUploader(
        base::Bind(&MetricsService::OnLogUploadComplete,
                   self_ptr_factory_.GetWeakPtr()));
  }

  const std::string hash =
      base::HexEncode(log_manager_.staged_log_hash().data(),
                      log_manager_.staged_log_hash().size());
  bool success = log_uploader_->UploadLog(log_manager_.staged_log(), hash);
  UMA_HISTOGRAM_BOOLEAN("UMA.UploadCreation", success);
  if (!success) {
    // Skip this upload and hope things work out next time.
    log_manager_.DiscardStagedLog();
    scheduler_->UploadCancelled();
    log_upload_in_progress_ = false;
    return;
  }

  HandleIdleSinceLastTransmission(true);
}

void MetricsService::OnLogUploadComplete(int response_code) {
  DCHECK(log_upload_in_progress_);
  log_upload_in_progress_ = false;

  // Log a histogram to track response success vs. failure rates.
  UMA_HISTOGRAM_ENUMERATION("UMA.UploadResponseStatus.Protobuf",
                            ResponseCodeToStatus(response_code),
                            NUM_RESPONSE_STATUSES);

  bool upload_succeeded = response_code == 200;

  // Provide boolean for error recovery (allow us to ignore response_code).
  bool discard_log = false;
  const size_t log_size = log_manager_.staged_log().length();
  if (upload_succeeded) {
    UMA_HISTOGRAM_COUNTS_10000("UMA.LogSize.OnSuccess", log_size / 1024);
  } else if (log_size > kUploadLogAvoidRetransmitSize) {
    UMA_HISTOGRAM_COUNTS("UMA.Large Rejected Log was Discarded",
                         static_cast<int>(log_size));
    discard_log = true;
  } else if (response_code == 400) {
    // Bad syntax. Retransmission won't work.
    discard_log = true;
  }

  if (upload_succeeded || discard_log) {
    log_manager_.DiscardStagedLog();
    // Store the updated list to disk now that the removed log is uploaded.
    log_manager_.PersistUnsentLogs();
  }

  if (!log_manager_.has_staged_log()) {
    switch (state_) {
      case SENDING_INITIAL_STABILITY_LOG:
        // The initial metrics log is already in the queue of unsent logs.
        state_ = SENDING_OLD_LOGS;
        break;

      case SENDING_INITIAL_METRICS_LOG:
        state_ = log_manager_.has_unsent_logs() ? SENDING_OLD_LOGS
                                                : SENDING_CURRENT_LOGS;
        break;

      case SENDING_OLD_LOGS:
        if (!log_manager_.has_unsent_logs())
          state_ = SENDING_CURRENT_LOGS;
        break;

      case SENDING_CURRENT_LOGS:
        break;

      default:
        NOTREACHED();
        break;
    }

    if (log_manager_.has_unsent_logs())
      DCHECK_LT(state_, SENDING_CURRENT_LOGS);
  }

  // Error 400 indicates a problem with the log, not with the server, so
  // don't consider that a sign that the server is in trouble.
  bool server_is_healthy = upload_succeeded || response_code == 400;
  scheduler_->UploadFinished(server_is_healthy, log_manager_.has_unsent_logs());

  if (server_is_healthy)
    client_->OnLogUploadComplete();
}

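// Illustrative sketch (not part of this file): the discard and health decisions
// made in OnLogUploadComplete() can be read as a pure function of the response
// code and the log size. UploadOutcome and ClassifyUpload are hypothetical
// names, and max_retransmit_size is an assumed parameter standing in for
// kUploadLogAvoidRetransmitSize, whose value is defined elsewhere.

```cpp
#include <cstddef>

struct UploadOutcome {
  bool discard_staged_log;  // Drop the staged log instead of retrying it.
  bool server_is_healthy;   // What the scheduler is told about the server.
};

// Mirrors the branch structure above: a log is discarded on success, when a
// rejected log is too large to be worth retransmitting, or on a 400 response.
// A 400 response blames the log rather than the server, so the server still
// counts as healthy in that case.
UploadOutcome ClassifyUpload(int response_code, std::size_t log_size,
                             std::size_t max_retransmit_size) {
  const bool succeeded = response_code == 200;
  const bool too_large_to_retry = !succeeded && log_size > max_retransmit_size;
  const bool malformed = response_code == 400;

  UploadOutcome outcome;
  outcome.discard_staged_log = succeeded || too_large_to_retry || malformed;
  outcome.server_is_healthy = succeeded || malformed;
  return outcome;
}
```

// A small log rejected with a 500 response is the only case that is kept for
// retransmission while the server is reported as unhealthy.
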
void MetricsService::IncrementPrefValue(const char* path) {
  int value = local_state_->GetInteger(path);
  local_state_->SetInteger(path, value + 1);
}

void MetricsService::IncrementLongPrefsValue(const char* path) {
  int64 value = local_state_->GetInt64(path);
  local_state_->SetInt64(path, value + 1);
}

bool MetricsService::UmaMetricsProperlyShutdown() {
  CHECK(clean_shutdown_status_ == CLEANLY_SHUTDOWN ||
        clean_shutdown_status_ == NEED_TO_SHUTDOWN);
  return clean_shutdown_status_ == CLEANLY_SHUTDOWN;
}

void MetricsService::AddSyntheticTrialObserver(
    SyntheticTrialObserver* observer) {
  synthetic_trial_observer_list_.AddObserver(observer);
  if (!synthetic_trial_groups_.empty())
    observer->OnSyntheticTrialsChanged(synthetic_trial_groups_);
}

void MetricsService::RemoveSyntheticTrialObserver(
    SyntheticTrialObserver* observer) {
  synthetic_trial_observer_list_.RemoveObserver(observer);
}

void MetricsService::RegisterSyntheticFieldTrial(
    const SyntheticTrialGroup& trial) {
  for (size_t i = 0; i < synthetic_trial_groups_.size(); ++i) {
    if (synthetic_trial_groups_[i].id.name == trial.id.name) {
      if (synthetic_trial_groups_[i].id.group != trial.id.group) {
        synthetic_trial_groups_[i].id.group = trial.id.group;
        synthetic_trial_groups_[i].start_time = base::TimeTicks::Now();
        NotifySyntheticTrialObservers();
      }
      return;
    }
  }

  SyntheticTrialGroup trial_group = trial;
  trial_group.start_time = base::TimeTicks::Now();
  synthetic_trial_groups_.push_back(trial_group);
  NotifySyntheticTrialObservers();
}

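// Illustrative sketch (not part of this file): RegisterSyntheticFieldTrial()
// is an update-or-insert keyed on the trial name, where the start time is
// reset only when the group actually changes. Trial and RegisterTrial are
// hypothetical, simplified stand-ins; the real SyntheticTrialGroup type is
// defined elsewhere and uses a base::TimeTicks start time.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified trial record used only for this sketch.
struct Trial {
  std::string name;
  std::string group;
  long start_time;
};

// Returns true when observers would need to be notified, i.e. when the trial
// is new or its group changed. Re-registering the same group is a no-op and
// keeps the original start time.
bool RegisterTrial(std::vector<Trial>* trials, const std::string& name,
                   const std::string& group, long now) {
  for (std::size_t i = 0; i < trials->size(); ++i) {
    Trial& existing = (*trials)[i];
    if (existing.name != name)
      continue;
    if (existing.group == group)
      return false;
    existing.group = group;
    existing.start_time = now;
    return true;
  }

  Trial added = {name, group, now};
  trials->push_back(added);
  return true;
}
```

// Resetting the start time on a group change matters because
// GetCurrentSyntheticFieldTrials() only reports trials whose start time is not
// later than the creation time of the current log.
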
void MetricsService::RegisterMetricsProvider(
    scoped_ptr<MetricsProvider> provider) {
  DCHECK_EQ(INITIALIZED, state_);
  metrics_providers_.push_back(provider.release());
}

void MetricsService::CheckForClonedInstall(
    scoped_refptr<base::SingleThreadTaskRunner> task_runner) {
  state_manager_->CheckForClonedInstall(task_runner);
}

void MetricsService::NotifySyntheticTrialObservers() {
  FOR_EACH_OBSERVER(SyntheticTrialObserver, synthetic_trial_observer_list_,
                    OnSyntheticTrialsChanged(synthetic_trial_groups_));
}

void MetricsService::GetCurrentSyntheticFieldTrials(
    std::vector<variations::ActiveGroupId>* synthetic_trials) {
  DCHECK(synthetic_trials);
  synthetic_trials->clear();
  const MetricsLog* current_log = log_manager_.current_log();
  for (size_t i = 0; i < synthetic_trial_groups_.size(); ++i) {
    if (synthetic_trial_groups_[i].start_time <= current_log->creation_time())
      synthetic_trials->push_back(synthetic_trial_groups_[i].id);
  }
}

scoped_ptr<MetricsLog> MetricsService::CreateLog(MetricsLog::LogType log_type) {
  return make_scoped_ptr(new MetricsLog(state_manager_->client_id(),
                                        session_id_,
                                        log_type,
                                        client_,
                                        local_state_));
}

void MetricsService::RecordCurrentEnvironment(MetricsLog* log) {
  std::vector<variations::ActiveGroupId> synthetic_trials;
  GetCurrentSyntheticFieldTrials(&synthetic_trials);
  log->RecordEnvironment(metrics_providers_.get(), synthetic_trials,
                         GetInstallDate());
  UMA_HISTOGRAM_COUNTS_100("UMA.SyntheticTrials.Count",
                           synthetic_trials.size());
}

void MetricsService::RecordCurrentHistograms() {
  DCHECK(log_manager_.current_log());
  histogram_snapshot_manager_.PrepareDeltas(
      base::Histogram::kNoFlags, base::Histogram::kUmaTargetedHistogramFlag);
}

void MetricsService::RecordCurrentStabilityHistograms() {
  DCHECK(log_manager_.current_log());
  histogram_snapshot_manager_.PrepareDeltas(
      base::Histogram::kNoFlags, base::Histogram::kUmaStabilityHistogramFlag);
}

void MetricsService::LogCleanShutdown() {
  // Redundant hack to write pref ASAP.
  MarkAppCleanShutdownAndCommit(&clean_exit_beacon_, local_state_);

  // Redundant setting to ensure that we always reset this value at shutdown
  // (and that we don't use some alternate path, and not call LogCleanShutdown).
  clean_shutdown_status_ = CLEANLY_SHUTDOWN;

  clean_exit_beacon_.WriteBeaconValue(true);
  RecordCurrentState(local_state_);
  local_state_->SetInteger(prefs::kStabilityExecutionPhase,
                           MetricsService::SHUTDOWN_COMPLETE);
}

bool MetricsService::ShouldLogEvents() {
  // We simply don't log events to UMA if there is a single incognito
  // session visible. The problem is that we always notify using the original
  // profile in order to simplify notification processing.
  return !client_->IsOffTheRecordSessionActive();
}

void MetricsService::RecordBooleanPrefValue(const char* path, bool value) {
  DCHECK(IsSingleThreaded());
  local_state_->SetBoolean(path, value);
  RecordCurrentState(local_state_);
}

void MetricsService::RecordCurrentState(PrefService* pref) {
  pref->SetInt64(prefs::kStabilityLastTimestampSec,
                 base::Time::Now().ToTimeT());
}

void MetricsService::SetConnectionTypeCallback(
    base::Callback<void(bool*)> is_cellular_callback) {
  DCHECK(!scheduler_);
  is_cellular_callback_ = is_cellular_callback;
}

}  // namespace metrics