ConnectionManager: Disable hard limit in favor of twiddled hammering values.
[thrasher.git] / perl / lib / Thrasher / Backend.pm
blobdd5578b17627960271f462701b44f6e4ba3b649e
1 package Thrasher::Backend;
2 use strict;
3 use warnings;
5 use Thrasher::Nodeprep qw(:all);
6 use Thrasher::Roster qw(:constants);
7 use Thrasher::Log qw(log);
9 use Carp qw(confess longmess);
11 =head1 NAME
13 Thrasher::Backend - base class for storage backends
15 =head1 DESCRIPTION
17 Thrasher::Backed abstracts out the backend of the
18 transport. Transports need to retain a small amount of state
19 per user, but this state could be stored in a wide variety of
20 ways. Thrasher Bird provides some default implementations,
21 including ways reverse-compatible with other transports, and
22 by implementing a new subclass of Thrasher::Backend, you can
23 provide your own.
25 Backends are responsible for storing the following information:
27 =over 4
29 =item *
31 The registration information, which is at a minimum a
32 legacy ID of some kind and the password information for that ID.
34 =item *
36 The mapping of IDs on the legacy service to JIDs, which
37 must be readable from both directions (JID -> legacy and
38 legacy -> JID). This must, in both directions, be accessed via
39 utf-8 encoded strings and no other encoding.
41 =item *
43 The storage of a small, arbitrary hash of strings to
44 strings, which will likely be empty.
46 The backend may encode this in any way it chooses. Optionally, it may
47 limit (such as hard-coding certain keys and forbidding all else),
48 or even fail to provide this service, at the cost of not working
49 with protocols that require it. (Not all protocols do.)
51 This feature may be removed if we end up not needing it for any
52 protocol.
54 =item *
56 The list of legacy users on the user's roster.
58 This is needed to compare the roster we get when we log on
59 to the list of users we remember from last time. Users with a
60 changed status will have those changes propogated to the user
61 when they next connect.
63 =back
65 Each of things should be accessible by using the JID of the user
66 that is using the transport.
68 A normal backend will implement the following:
70 =over 4
72 =item *
74 C<new>: To set up any necessary database connections or
75 what ever.
77 =item *
79 C<registered>
81 =item *
83 C<register>
85 =item *
87 C<remove>
89 =item *
91 C<jid_to_legacy_name>
93 =item *
95 C<retrieve_legacy_name_to_jid>
97 =item *
99 C<store_username_mapping>
101 =item *
103 C<get_roster>
105 =item *
107 C<set_roster>
109 =item *
111 C<set_roster_user_state>
113 =item *
115 C<get_roster_user_state>
117 =item *
119 C<all_jids> (only needed to be a source for migration)
121 =back
123 =head1 INTERFACE
125 The Thrasher::Backend interface is as follows:
127 =over 4
129 =item *
131 C<new>($configuration_hash) - The configuration hash for the backend,
132 as provided by the user in the configuration file.
134 =cut
136 sub new {
137 my $class = shift;
138 my $self = {};
139 bless $self, $class;
140 return $self;
143 =pod
145 =item *
147 C<registered>($jid) - If the user is registered, should return
148 a hashref containing the users username, password, and any other
149 relevant fields for the protocol. (Note not all protocols are limited
150 to those two fields.) If they are not registered, should return
151 C<undef>.
153 =cut
155 sub registered {
156 my $self = shift;
157 die "registered method not implemented in " . ref($self);
160 =pod
162 =item *
164 C<register>($jid, $hashref) - The hash ref gives all relevant
165 registration information for this protocol. Again, it may be more
166 than just "username" and "password".
168 =cut
170 sub register {
171 my $self = shift;
172 die "register method not implemented in " . ref($self);
175 =pod
177 =item *
179 C<remove>($jid) - The user identified by the given JID is
180 being removed from the transport. The JID will not have a resource
181 on it. The user's data should be removed as well.
183 =cut
185 sub remove {
186 my $self = shift;
187 die "remove method not implemented in " . ref($self);
190 =pod
192 =item *
194 C<legacy_name_to_jid>($user_jid, $legacy_username, $component_name, $lang):
195 Returns the JID associated with the $legacy_username, encoded in
196 UTF-8, for the user identified by C<$jid>, according to the following
197 rules:
199 =over 4
201 =item *
203 Once a JID has been returned for a given $legacy_username
204 to a given user, the same JID must always be returned to that user,
205 subject to the following exception: If the userr removes themself
206 from the transport, the JID may be different if they re-register
207 at a later time, but must then be the same until the next removal.
209 =item *
211 The JID returned for a given $legacy_username must be unique.
212 If your normalization rules map two distinct legacy IDs to the same
213 JID, they I<must> be disambiguated somehow.
215 =item *
217 The JID returned absolutely I<must> conform to the nodeprep
218 and stringprep rules for JIDs and hosts, as described in the RFC.
219 See Thrasher::Nodeprep for functions to help with this.
221 =back
223 This is a template method that calls out to the following routines
224 to do its work:
226 =over 4
228 =item *
230 C<store_username_mapping>
232 =item *
234 C<jid_to_legacy_name> (used by C<unduplicate>)
236 =item *
238 C<normalize_name>
240 =item *
242 C<unduplicate> (has a useful default implementation)
244 =back
246 You probably don't have to modify this method, just implement the
247 above as desired.
249 $component_name is assumed to already conform with the JID rules for
250 a server name, or Thrasher will die long before it gets here.
252 $lang is currently unused, but gives implementers the possibility to
253 meaningfully override the unification process for different languages.
254 (However, that should probably be done by modifying the code
255 in this file, not in your override.)
257 =cut
259 sub legacy_name_to_jid {
260 my $self = shift;
261 my $user_jid = shift;
262 my $legacy_username = shift;
263 my $component_name = shift;
264 my $lang = shift;
266 # During testing, this won't be true.
267 if ($self->{component} && $self->{component}->{protocol}) {
268 $legacy_username = $self->{component}->{protocol}->process_remote_username($legacy_username);
271 my $current = $self->retrieve_legacy_name_to_jid
272 ($user_jid, $legacy_username);
273 if (defined($current)) {
274 return $current;
277 my $default_mapping = $self->normalize_name($legacy_username);
278 my $default_jid = $default_mapping . '@' . $component_name;
280 my $actual_mapping = $self->uniquify_name
281 ($user_jid, $default_jid, $lang);
283 $self->store_username_mapping($user_jid, $legacy_username,
284 $actual_mapping);
285 return $actual_mapping;
288 =pod
290 =item *
292 C<retrieve_legacy_name_to_jid>($user_jid, $legacy_username):
293 This directly queries the data store to return what the JID for
294 the $legacy_username is, or returns undef if there is no
295 mapping. The component will always use C<legacy_name_to_jid>,
296 which uses what this function returns if it is defined, or
297 otherwise takes care of creating and remembering a mapping.
299 There is no default implementation for this.
301 =cut
303 sub retrieve_legacy_name_to_jid {
304 my $self = shift;
306 die "Method retrieve_legacy_name_to_jid not implemented in "
307 .ref($self);
310 =pod
312 =item *
314 C<jid_to_legacy_name>($user_jid, $target_jid): Returns the legacy name
315 associated with the $target_jid, as seen in C<legacy_name_to_jid>, for
316 transport user C<$user_jid>.
318 As much as I'd like to be a tight-ass about jids we've never seen
319 before, that doesn't seem to be in the cards since so many clients
320 will just toss out JIDs willy-nilly. If you don't have a remembered
321 JID, you are required to try to fake one up.
323 =cut
325 sub jid_to_legacy_name {
326 my $self = shift;
327 die "jid_to_legacy_name not implemented in " . ref($self);
330 sub fake_up_a_legacy_name {
331 my $self = shift;
332 my $user_jid = shift;
333 my $jid = shift;
335 my $legacy_guess = $jid;
336 $legacy_guess =~ s/\@.*$//;
337 # Extra transforms to match those in nodeprep:
338 $legacy_guess =~ s/\%/\@/;
340 if ($self->{'component'} && $self->{'component'}->{'protocol'}) {
341 $legacy_guess = $self->{'component'}->{'protocol'}->fake_up_a_legacy_name($user_jid, $jid, $legacy_guess);
344 # Store it so we don't stomp on it later.
345 $self->store_username_mapping($user_jid, $legacy_guess, $jid);
346 return $legacy_guess;
349 =pod
351 =item *
353 C<jid_has_legacy_name>($user_jid, $target_jid): Returns true if
354 the target JID already has a legacy name, false otherwise.
355 Used for uniquing.
357 =cut
359 sub jid_has_legacy_name {
360 my $self = shift;
361 die "jid_has_legacy_name not implemented in " . ref($self);
364 =pod
366 =item *
368 C<normalize_name>($legacy_name): This provides
369 an implemention of a default normalization scheme, using
370 Thrasher::Nodeprep.
372 You likely don't need to override this.
374 =cut
376 # FIXME: Does this belong with the Protocol, or even in a separate
377 # library entirely? This belongs with Uniquify, but that may
378 # not belong here either.
379 sub normalize_name {
380 my $self = shift;
381 my $legacy_name = shift;
383 if (!defined($legacy_name)) {
384 confess "Being asked to normalize 'undef'.";
387 # If the stringprep routine results in an empty string,
388 # at least return something usable as a JID. This should
389 # really never ever happen.
390 my $normalized = nodeprep($legacy_name) || 'unknown';
391 return $normalized;
394 =pod
396 =item *
398 C<store_username_mapping>($user_jid, $legacy_username, $mapped_jid):
399 The backend has decided on the mapping for the legacy username
400 for the given user, and the mapping needs to be stored. For the
401 user identified by $user_jid, store that the $legacy_username maps
402 to $mapped_jid, such that all future calls to C<legacy_name_to_jid>
403 will return that mapped jid, and C<jid_to_legacy_name> will return
404 the $legacy_username.
406 =cut
408 sub store_username_mapping {
409 my $self = shift;
410 die "store_username_mapping unimplemented for " . ref($self);
413 =pod
415 =item *
417 C<uniquify_name>($user_jid, $possible_legacy_jid, $lang): Given the
418 username mappings for $user_jid, return a JID username based
419 on $possible_legacy_jid that does not return a value when passed
420 to C<jid_to_legacy_name>, which is to say, it has not been
421 assigned yet.
423 This default method ignores language for now, and simply starts
424 appending numbers on the end until something unique is found. However,
425 it should be very, very unusual for unique legacy service names to
426 map to the some $possible_legacy_jid; this is I<major> paranoia
427 on Thrasher's part.
429 =cut
431 sub uniquify_name {
432 my $self = shift;
433 my $user_jid = shift;
434 my $possible_legacy_id = shift;
435 my $lang = shift;
437 if (!$self->jid_has_legacy_name($user_jid, $possible_legacy_id)) {
438 return $possible_legacy_id;
441 my ($username, $server) = split /\@/, $possible_legacy_id;
443 # More major paranoia; give up after 100, something's wrong;
444 # this can happen during debugging
445 for (my $i = 2; $i < 100; $i++) {
446 my $candidate_jid = $username . $i . '@' . $server;
447 if (!$self->jid_has_legacy_name($user_jid, $candidate_jid)) {
448 return $candidate_jid;
452 die "In uniquify_name, after 98 attempts, couldn't find a "
453 ."JID to assign for '$possible_legacy_id'.";
456 =pod
458 =item *
460 C<set_avatar>($user_jid, $legacy_username, $avatar_png_data_base64):
461 The legacy user has set the given avatar for a given legacy_username
462 for the given $user_jid. vcard and PEPAvatar will use this. This must
463 fire the "avatar_changed" callback. The avatar should be in PNG
464 format, and encoded in base64 already.
466 You need to store this such that C<get_avatar> can retreive the base64
467 encoding of the avatar.
469 =cut
471 sub set_avatar {
472 my $self = shift;
473 die "set_avatar not implemented in " . ref($self);
476 =pod
478 =item *
479 C<get_avatar>($user_jid, $legacy_username): Returns the base64 avatar
480 information from a corresponding C<set_avatar>.
482 =cut
484 sub get_avatar {
485 my $self = shift;
486 die "get_avatar not implemented in " . ref($self);
489 =pod
491 =item *
493 C<all_avatars>($jid): Gets a hash mapping all legacy names
494 to their avatar value, for migration purposes.
496 =cut
498 sub all_avatars {
499 my $self = shift;
500 die "all_avatars not implemented in " . ref($self);
503 =pod
505 =item *
507 C<get_roster>($user_jid): Returns the roster for the given
508 user, as a hash where the legacy name is the key and the
509 type of roster entry is the value, where the value is chosed from
510 $self->subscribed or $self->want_subscribe. (These are numeric
511 constants given methods for convenience.) Rosters should not
512 contain $self->unsubscribed, as that is the assumed default state.
514 If there is no roster, this function should return an empty hash, not
515 undef.
517 Note that for the following few functions, the "roster" only
518 pertains to what is stored in the Backend. "Clearing the roster",
519 for instance, does nothing to the user's roster.
521 This has no default implementation.
523 =cut
525 sub get_roster {
526 my $self = shift;
527 die "get_roster is not implemented in " . ref($self);
530 =pod
532 =item *
534 C<set_roster>($user_jid, $roster): Sets the roster to the given
535 roster hash, with the C<$roster> hash in the format as described
536 in C<get_roster>. This should empty out any current settings and
537 result in the backend I<only> having the roster as described in
538 C<$roster>.
540 This has no default implementation.
542 =cut
544 sub set_roster {
545 my $self = shift;
546 die "set_roster is not implemented in " . ref($self);
549 =pod
551 =item *
553 C<set_roster_user_state>($user_jid, $legacy_username, $state):
554 Sets the roster to contain the given C<$legacy_username> in the
555 given C<$state>, where C<$state> is one of C<$self->subscribed>,
556 C<$self->unsubscribed>, or C<$self->want_subscribe>.
558 In the case that a state is set to C<$self->unsubscribed>, it
559 should actually be entirely removed from the roster, and not
560 reported as part of the roster.
562 This has no default implementation.
564 =cut
566 sub set_roster_user_state {
567 my $self = shift;
568 die "set_roster_user_state is not implemented in " . ref($self);
571 =pod
573 =item *
575 C<get_roster_user_state>($user_jid, $legacy_username):
576 Gets the roster state, which is one of constants defined in
577 Thrasher::Roster. The implementation should always return one
578 of those three constants, ||= $self->unsubscribed if necessary.
580 This is used upon initial registration; the legacy service sends the
581 presence of the remote users immediately, but we can not correctly
582 propagate that until they have been subscribed to.
584 =cut
586 sub get_roster_user_state {
587 my $self = shift;
588 die "get_roster_user_state is not implemented in " . ref($self);
591 =pod
593 =item *
595 C<all_jids>(): Returns an arrayref of all JIDs currently in
596 the system. Used for the migrator.
598 =cut
600 sub all_jids {
601 my $self = shift;
602 die "all_jids is not implemented in " . ref($self);
605 =pod
607 =item *
609 C<all_mappings>($user_jid): Returns a hash ref of all
610 currently-existing mappings, for migration purposes. The
611 keys are the legacy names and the values are the JIDs.
613 =cut
615 sub all_mappings {
616 my $self = shift;
617 die "all_mappings is not implemented in " . ref($self);
620 =pod
622 =item *
624 C<all_misc>($user_jid): Returns a hash ref of all
625 "misc" values, for migration purposes.
627 =cut
629 sub all_misc {
630 my $self = shift;
631 die "all_misc is not implemented in " . ref($self);
634 =pod
636 =item *
638 C<set_misc>($user_jid, $key, $value): Sets a 'misc'
639 key/value. If your backend doesn't support this, don't
640 implement it, as scripts may look at the standard die
641 message to determine this fact.
643 =cut
645 sub set_misc {
646 my $self = shift;
647 die "set_misc is not implemented in " . ref($self);
650 =pod
652 =item *
654 C<get_misc>($user_jid, $key): Gets a 'misc' value.
656 =cut
658 sub get_misc {
659 my $self = shift;
660 die "get_misc is not implemented in " . ref($self);
663 =pod
665 =item *
667 C<register_protocol>($protocol): Registers the protocol
668 being used with this backend, which permits backends to
669 react to protocols as needed. Particularly useful may be
670 $protocol->registration_items. The default implementation
671 does nothing.
673 =cut
675 sub register_protocol { }
677 =pod
679 =back
681 =cut
683 # Undocumented: clear_backend completely destroys all data
684 # in the current backend. Used for testing purposes, to
685 # reset the backend to a known state before proceding with
686 # testing. Generally, this is expected to also destroy all
687 # state that Thrasher Bird understands.
688 sub clear_backend {
689 my $self = shift;
690 die "clear_backend is not implemented in " . ref($self);