Okay, my own input on "Amazonfail" here.
As a cataloger myself, I can tell you that it is possible for a technical screw-up to cause the observed result. Without knowing more about Amazon's database I could not tell you how, precisely. (Even with knowing some more, it would probably be obscure to me as I doubt they are running Unicorn, so what I am about to describe is a hypothetical situation based on my own experiences.) On our own system it would be something like "misconfigure a report that globally does something to subject heading indexes", which is easy to do if you are not scrupulously careful because there are lots of fiddly little options on many reports even in the GUI; those who venture to home-cook API scripts without first consulting Customer Care, beware!
I don't know if this was necessarily subject-category related (it could be some kind of invisible internal tagging Amazon applies, for instance), but if it was, since many books have multiple subject headings it is plausible to me that something some programmer did somewhere affected only certain titles that had the right "lucky" one or two, even while they happened to share subjects three and four or whatever with other titles, so the effect was that only part of the pool was masked.
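To make the multiple-headings point concrete, here is a minimal sketch. Everything in it is made up for illustration (the titles, the placeholder headings S1 to S4, and the `mask_by_heading` function); it is not a claim about how Amazon's or anyone else's system actually works. It only shows how a global change keyed on one or two headings can hit some titles and miss others that share the rest of their headings.

```python
# Hypothetical catalog: each title carries several subject headings.
# S1..S4 are placeholder headings, not real subject terms.
CATALOG = {
    "Title A": {"S1", "S3", "S4"},
    "Title B": {"S3", "S4"},
    "Title C": {"S2", "S3"},
    "Title D": {"S4"},
}


def mask_by_heading(catalog, flagged_headings):
    """Return the titles a global update would touch:
    any title carrying at least one of the flagged headings."""
    return {
        title
        for title, headings in catalog.items()
        if headings & flagged_headings  # non-empty intersection means a hit
    }


# Suppose the misconfigured job keys on S1 and S2 only.
masked = mask_by_heading(CATALOG, {"S1", "S2"})
visible = set(CATALOG) - masked

print(sorted(masked))
print(sorted(visible))
```

Title A and Title B share headings S3 and S4, yet only Title A is masked, because it also happens to carry S1. From the outside that looks selective, even though the job never looked at anything but the heading list.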
NB I am not saying there is no problem here, merely that I think remarks like "nuh-uh, it was awfully selective to be just a mistake!" are underinformed about what is possible to occur with massive book databases.
no subject
Date: April 16th, 2009 01:31 am (UTC) From:
And yes, it's also more than a little disturbing that the top result on searching "homosexuality" is a book on how to "prevent" homosexuality in one's children. (As if such a thing were possible.) Got Homophobia?
NB: I am the single mother of a grown son, and the only thing I did to influence his sexual orientation (which turned out to be straight) was to assure him at the age of 15 that it's normal for a 15 year old to be horny all the time. ;-) Don't tell him I told you that.
no subject
Date: April 16th, 2009 01:53 am (UTC) From:
The ranking system is basically telling you what others thought was best, not what Amazon Inc. does personally (corporately). It is aggregate data, probably somewhat mysteriously arrived at via combining a bunch of factors, like Google's PageRank. But aside from that, in this case it sounded to me like the ranking values in some books' records were either being blanked out or ignored, so they wouldn't show because the rank was null as far as the page generating code was concerned. (You'd think something with "zero" rank would simply show at the bottom, but it's possible that it was really "null" or something else the code would consider corrupt, and therefore skip the records entirely.) That such a thing could happen is perhaps disturbing (certainly frustrating), but I don't think this is designed to be a filtering process used as such on purpose, rather a side effect of certain data being erased or ignored.
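Here is a rough sketch of the zero-versus-null difference. The record layout, the `score` field (higher is better here, purely by assumption), and the `build_results` function are all hypothetical; I have no idea what the real page-generating code looks like.

```python
# Hypothetical records; "score" stands in for whatever ranking value is stored.
RECORDS = [
    {"title": "Book A", "score": 120},
    {"title": "Book B", "score": 0},     # a real zero: still a valid number
    {"title": "Book C", "score": None},  # blanked out: no value at all
    {"title": "Book D", "score": 5},
]


def build_results(records):
    """Build a results list, highest score first.

    Records whose score is not an integer are treated as corrupt and
    skipped entirely rather than being sorted to the bottom.
    """
    usable = [r for r in records if isinstance(r["score"], int)]
    ordered = sorted(usable, key=lambda r: r["score"], reverse=True)
    return [r["title"] for r in ordered]


results = build_results(RECORDS)
print(results)
```

Book B, with a score of zero, still shows up at the bottom of the list. Book C, whose score was blanked, never appears at all, and nothing in the code made a judgement about what kind of book it is.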
no subject
Date: April 16th, 2009 02:00 am (UTC) From:
Does that make sense? I'm on my way to work, so I'm thinking and writing in a hurry. ;-)
no subject
Date: April 16th, 2009 02:34 am (UTC) From:
I don't think any person or group "chose" that in the active sense. My speculation is that what did the "sifting" was entirely automatic: the code which writes search-result pages saw these items had no rank, or corrupt rank, or something like that, and so dropped them from results lists. (I seem to recall people saying that you could still locate things by searching by title.)
I don't mean there is a value judgement going on ("books with no ranking are not results anyone wants"), rather something more like error-checking: if part of the record does not match the robot's expected formatting, it may be instructed to drop the whole record as a bad job, rather than insert bad data which might do anything from appearing poorly formatted onscreen to causing the program to crash.
Obviously I have not deconstructed Amazon's back end, but my programmer husband tells me this is a plausible scenario, if not verifiable in this case.
As to how that rank data problem occurred in the first place (what you are terming "sifting some books out of the sales-ranking factor", i.e. removing them from a possible pool), that's a separate question. I think the explanation that the books got excluded by accident is plausible, such as with the scenario I described from cataloging experience where something semi-automated that alters field B based on what's in fields A and C is mis-configured and yields results you did not intend. It is of course also possible it was the evil intent people ascribed at first (akin to "let's make sure these kinds of books don't get any rank"), or personal malice of a person who can alter this sort of data (an inside job vs. an external hack, but not something sanctioned by Amazon).
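For what I mean by a mis-configured semi-automated job, here is a toy version. The fields A, B and C, the values in them, and the `require_both` option are all invented for the example; it is a sketch of the general failure mode from my own cataloging experience, not a reconstruction of what happened at Amazon.

```python
def make_records():
    """Fresh hypothetical records; field names and values are placeholders."""
    return [
        {"id": 1, "A": "x", "B": "ranked", "C": "y"},
        {"id": 2, "A": "x", "B": "ranked", "C": "z"},
        {"id": 3, "A": "w", "B": "ranked", "C": "y"},
        {"id": 4, "A": "w", "B": "ranked", "C": "z"},
    ]


def run_report(records, match_a, match_c, new_b, require_both=True):
    """Set field B on every record selected by fields A and C.

    require_both=True  selects records where A AND C match (the intent).
    require_both=False selects records where A OR C matches (one wrong option).
    Returns the ids of the records that were changed.
    """
    changed = []
    for rec in records:
        hit_a = rec["A"] == match_a
        hit_c = rec["C"] == match_c
        matched = (hit_a and hit_c) if require_both else (hit_a or hit_c)
        if matched:
            rec["B"] = new_b
            changed.append(rec["id"])
    return changed


intended = run_report(make_records(), "x", "y", "unranked")
misconfigured = run_report(make_records(), "x", "y", "unranked", require_both=False)

print(intended)
print(misconfigured)
```

With the option set as intended, one record changes. With the single option flipped, three of the four do, and record 4 escapes only because neither of its fields happens to match. Nobody picked those three records by hand.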
I was not trying to offer a complete solution to how this obviously was all by accident, simply contesting the single notion of "it was too selective for that to be possible". The apparent selectivity could be a side effect of the specific internal structure of the relational database. Poking part of it in a certain way could produce this result without someone having specifically intended it.
no subject
Date: April 16th, 2009 02:41 am (UTC) From:
no subject
Date: April 16th, 2009 03:30 am (UTC) From:
Whether they got hacked (one theory), there was a miscommunication (as described in one article) between programmers and policy, or it was a bad program.
Also, all these people need to STFU because it wasn't just "this" subject or "that" subject. Amazon has said over 52,000 books were affected by this. A bunch of people are under the impression that it was only gay/lesbian and erotic. It was much more than that. And, frankly, Amazon, like many search engines, may need to make a safe search soon enough.
no subject
Date: April 16th, 2009 12:52 pm (UTC) From:
no subject
Date: April 16th, 2009 07:01 pm (UTC) From:
Seriously. More than you want to know there, I'm sure.