Incident Report X040901
Degraded Blogger Feed

0.10 2004-09-17 -16:49 -0700


Category: Incorrect Data - Degraded Functionality
Incident ID: X040901
Priority: 5 - Annoying
Status: Picking Up Loose Ends
Subject: Atom Site Feeds, Blogger 5.15
Repaired in: 2004-09-09 weekly build
Assigned To: Dennis Hamilton (analysis)
Reported By: Dennis Hamilton (2004-09-04)
Date Opened: 2004-09-04
     2004-09-07 on Blogger Support
Date Closed: 2004-09-10

1. Summary
2. Remedies
3. Actions
4. Analysis
5. Lessons Learned

see also:
X040702: Blogger FTP Corruption
B040801: Blog Slamdown Procedure
"Your%20Message%20Here" article
"Honey, Where'd You Put the Bloggo?"
"A Feed Too Far"

1. Summary (2004-09-10):

Incident

Orcmid's Lair Site Feed Degradation (X040901)

2004-09-04

When my solitary 09-03 posting was created, Blogger generated a bogus site feed that provides incorrect links for the articles.  Having made a template change that went into effect at the same time, I am not sure what the contributory karma is.  I do notice, however, that the same thing just happened on the Google Blog and that it happened with Nancy White's Blog on her last post Thursday evening.  There is no lockdown, but I have declared an incident on the Atom feed at Orcmid's Lair and I am not posting anywhere else until this is sufficiently resolved.
   [dh:2004-09-05T08:16Z The change in template is eliminated as the cause.  The Atom feed element that has changed is entirely controlled by Blogger.com.  This change apparently occurred sometime on Thursday, September 2, and it should affect the feeds of all Blogger.com-intermediated blogs that have had new postings since then.  There is no indication that this is a known problem.]
   [dh:2004-09-06T18:00Z The degradation is still present and there is no indication that this is a known problem at the Blogger site.  I have turned in a support message and pointed to the Google Blog and my latest notes on Incident X040901.]
   [dh:2004-09-10T15:11Z The Atom feeds now seem to be operating as before, based on the results for last night's posting.  This incident was reported on the Blogger Support page on 2004-09-07 and I managed to miss it somehow.  So the cool thing is that a systematic process seems to be in place and working (and Blogger is used in the support process).]
  [dh:2004-09-11T05:55Z The feeds are being generated correctly and I have provoked recreation of as much of the feeds as I am able.]

Sometime on Thursday, September 2, Blogger.com changed the way that it links to articles from the Atom feed.  Instead of providing the permalink to the post of the article, the site feed is now providing the URL of the archive page (not even the article within the archive page, which would be workable).
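The symptom described above can be checked mechanically.  What follows is only a minimal sketch, not anything Blogger provides: it reads an Atom 0.3 feed and flags entries whose alternate link looks like an archive page with no fragment pointing at the article.  The "_archive" file-name test, the example.org URLs, and the function names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by Atom 0.3 feeds.
ATOM03 = "{http://purl.org/atom/ns#}"

def entry_links(feed_xml):
    """Return (title, href) for each entry's rel="alternate" link."""
    root = ET.fromstring(feed_xml)
    pairs = []
    for entry in root.iter(ATOM03 + "entry"):
        title = entry.findtext(ATOM03 + "title", default="")
        for link in entry.findall(ATOM03 + "link"):
            if link.get("rel") == "alternate":
                pairs.append((title, link.get("href", "")))
    return pairs

def looks_like_archive(href):
    """Heuristic (an assumption): archive page names contain "_archive".

    An archive URL with a fragment still leads to the article, so it is
    treated as workable and not flagged.
    """
    last_segment = href.rsplit("/", 1)[-1]
    return "_archive" in last_segment and "#" not in last_segment

def degraded_entries(feed_xml):
    """List the entries whose link points at a bare archive page."""
    return [(t, h) for t, h in entry_links(feed_xml) if looks_like_archive(h)]

# Hypothetical sample feed; the URLs are placeholders, not real posts.
SAMPLE_FEED = """<feed version="0.3" xmlns="http://purl.org/atom/ns#">
  <title>Example Blog</title>
  <entry>
    <title>First post</title>
    <link rel="alternate" type="text/html"
          href="http://example.org/blog/2004/09/first-post.html"/>
  </entry>
  <entry>
    <title>Second post</title>
    <link rel="alternate" type="text/html"
          href="http://example.org/blog/2004_09_01_archive.html"/>
  </entry>
</feed>"""

if __name__ == "__main__":
    for title, href in degraded_entries(SAMPLE_FEED):
        print("archive link in feed:", title, href)
```

Running the same check against a saved copy of another site's feed would be one way to confirm that the problem is not local to a single blog.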

On Tuesday, September 7, the incident was recognized on the Blogger Support page and my first feed update on Thursday, September 9, seems to be working.

I am closing out this incident Friday, September 10, with only loose ends to be cleaned up in the documentation for possible future reference.

There is more provisional material here until the analysis and actions are formulated better.

2. Remedies (2004-09-10):

1. Put a post on Orcmid's Lair to the effect that we are proceeding under degraded operation.  Do this before any new posts on each feed, so that subscribers will see the notice in the feed and on the site.  These will stay up until there is an all-clear on the situation, or it is clear that the situation will not be remedied.

2. Notify Google support of the incident and where they can find out more, and document that this does not seem to be correct according to the Atom 0.3 specification that they identify their feed as conforming to.  [I need to double-check all of that.]

3. Wait for the repair to be available.  Repost my feed if appropriate, so subscribers will have proper permalinks in their aggregated materials.  I could have manually corrected the feed at once, and I didn't think to do that.

4. Trigger regeneration of the feeds after Blogger deploys a corrected version of the feed generator.

3. Actions (as of 2004-09-10):

4. Analysis (2004-09-10):

[replace with more stuff]

Nancy's Blog is at http://www.fullcirc.com/weblog/onfacblog.htm

Google's Blog is at http://www.google.com/googleblog/

I also noticed the problem when there was a new post on LogBlob. 

I have an initial summary of the situation here.

5. Lessons Learned (tbd as of 2004-09-10):

[notes to be developed further]

When the red lights go on [reference to Ross Anderson passage about that.]

Parts of the slam-down that worked, other parts that were pokey

Remembering to capture the data, including the evidence of differences that confirms the incident.  Do this more quickly.  This includes the feeds demonstrating that the incident applied to other sites.
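One way to make that capture quick is to save each feed copy under a name that records when it was taken and a digest of its contents.  This is a sketch under my own assumptions: the function name, the file-naming scheme, and the directory layout are hypothetical, and fetching the feed is left out so the example has no network dependency.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path

def snapshot_feed(feed_bytes, directory, label, when=None):
    """Write feed_bytes to directory and return the path of the new file.

    The file name combines a label, a UTC timestamp, and the first 12 hex
    digits of the SHA-256 digest, so identical captures are easy to spot.
    `when` should be a timezone-aware datetime; it defaults to the current time.
    """
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    digest = hashlib.sha256(feed_bytes).hexdigest()
    stamp = when.strftime("%Y%m%dT%H%M%SZ")
    path = Path(directory) / f"{label}-{stamp}-{digest[:12]}.xml"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(feed_bytes)
    return path

if __name__ == "__main__":
    import tempfile
    # Placeholder content standing in for a downloaded feed.
    saved = snapshot_feed(b"<feed/>", tempfile.mkdtemp(), "lair")
    print("saved", saved)
```

Saving the feeds of the other affected sites the same way would preserve the evidence that the incident was not confined to one blog.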

Example of feed differences important to achieve.  Archiving threat tree needed and it leaves the feed exposed unless I back it up more often than the site.  Separate incident analysis?

Example of aggregator presentation differences

Identification of version changes in feeds so that trouble-shooting can be matched against change levels - lesson for Blogger, lesson for me

Identification of version of Atom is fine, and there needs to be identification of an implementation profile somehow, and access to that profile as well - lesson for me as a developer.  [2004-09-09-09:04 On my end, there needs to be identification of the specific versions of software used.  That's not so important this time but it matters in other cases.  Look at what it takes to know configurations and maintain that information easily.]
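As far as I know, an Atom 0.3 feed carries a version attribute on the feed element and may carry a generator element with its own version attribute.  A small sketch of recording both alongside trouble-shooting notes follows; the function name and the sample feed are illustrative assumptions, with the 5.15 value taken from the subject line of this report.

```python
import xml.etree.ElementTree as ET

ATOM03 = "{http://purl.org/atom/ns#}"

def feed_fingerprint(feed_xml):
    """Return the Atom version plus generator name and version, if present."""
    root = ET.fromstring(feed_xml)
    gen = root.find(ATOM03 + "generator")
    return {
        "atom_version": root.get("version"),
        "generator": (gen.text or "").strip() if gen is not None else None,
        "generator_version": gen.get("version") if gen is not None else None,
    }

# Illustrative sample only; not a captured feed.
SAMPLE = (
    '<feed version="0.3" xmlns="http://purl.org/atom/ns#">'
    '<generator url="http://www.blogger.com/" version="5.15">Blogger</generator>'
    "</feed>"
)

if __name__ == "__main__":
    print(feed_fingerprint(SAMPLE))
```

A fingerprint like this does not identify an implementation profile, which is the gap noted above, but it does let a symptom be matched against the generator level that produced it.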

Don't slip stream something that has potential to surprise people.  Don't slip stream something that may impact the boss or embarrass the management and then go away for a long weekend! [I really do need to implement a manual feed separate from the Wingnut site feed for experimentation with the particular case of feed-variation sensitivity in various aggregators.]

There's a big lesson for me in being attentive to when I am operating in the solution space and have lost sight of the problem space and the impact of solutions on the problem space.  Importance of change management review and some provisions for testing changes.  I don't know what solitary practices can help here - it really requires a checklist and multiple eye-balls and thinking to be pulled back up to a solution perspective, I think.  [Take some of this over to the note on interaction design and Alan Cooper's trade classifications.]

[dh:2004-09-06-10:11 There is also something to be said about interoperability, established expectations, and whatever any current specifications might or might not say on the matter.]

[dh:2004-09-10-09:39 I suppose it was all right to wait until the next build to put in the fix, although it makes more work for me to ensure that my feed entries out there are corrected (as much as I can handle that from this end).  A lesson for me is that (1) there must be regression testing and (2) there always needs to be a way to roll back any change that is pushed out.  This leads to (3) there being problems about changes tied to builds where a roll-back may also remove a critical fix as well as undo an introduced problem.  I don't have an answer for that, though it looks like an architectural question as much as a system engineering one.]

[dh:2004-09-10-22:32 There is no good way to tell that a feed update is one that will drive out bad entries and replace them with improved ones.  It is necessary to examine the feeds themselves, as XML, and also see what happens in practice.  This is an area where different testing is needed, along with some kind of regression methodology.]
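Examining the feeds as XML can be partly automated.  The following is a rough sketch, assuming two saved copies of the same Atom 0.3 feed (before and after a repair): it pairs entries by their id element and reports those whose alternate link changed.  The function names and sample data are hypothetical, and it says nothing about what an aggregator will do with the change in practice.

```python
import xml.etree.ElementTree as ET

NS = {"atom": "http://purl.org/atom/ns#"}

def links_by_id(feed_xml):
    """Map each entry's id to the href of its rel="alternate" link."""
    root = ET.fromstring(feed_xml)
    result = {}
    for entry in root.findall("atom:entry", NS):
        entry_id = entry.findtext("atom:id", default="", namespaces=NS)
        link = entry.find("atom:link[@rel='alternate']", NS)
        result[entry_id] = link.get("href", "") if link is not None else ""
    return result

def changed_links(before_xml, after_xml):
    """Return {entry id: (old href, new href)} for entries whose link differs."""
    before = links_by_id(before_xml)
    after = links_by_id(after_xml)
    return {
        eid: (before[eid], after[eid])
        for eid in before
        if eid in after and before[eid] != after[eid]
    }

def make_feed(href):
    """Build a one-entry sample feed; the id and URLs are placeholders."""
    return (
        '<feed version="0.3" xmlns="http://purl.org/atom/ns#"><entry>'
        "<id>tag:example.org,2004:post-1</id>"
        f'<link rel="alternate" type="text/html" href="{href}"/>'
        "</entry></feed>"
    )

BEFORE = make_feed("http://example.org/blog/2004_09_01_archive.html")
AFTER = make_feed("http://example.org/blog/2004/09/post-1.html")

if __name__ == "__main__":
    for eid, (old, new) in changed_links(BEFORE, AFTER).items():
        print(eid, old, "->", new)
```

An empty result after a regeneration would suggest that the bad entries were not replaced, at least in the copy of the feed that was saved.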


0.10 2004-09-07-08:44 Awaiting Resolution
There is enough information here and in related posts for this situation to be recognized and resolved at Blogger.com.  I will tidy up from here, now that the incident is fully captured.
0.00 2004-09-04-11:02 Create Initial Incident Identification, Analysis, and Proposed Actions
Capture enough material so that other troubleshooting can proceed and this report can be polished later.  Manage with a job jar.

created 2004-09-04-11:02 -0700 (pdt) by orcmid
$$Author: Orcmid $
$$Date: 07-02-16 16:53 $
$$Revision: 18 $