arcanetrivia

What Amazon is doing, as I understand it, is choosing to filter certain titles, based on certain (not relevant to a straightforward search) criteria, out of the rank-based results. [...] what I'm objecting to is that they are sifting some books out of the sales-ranking factor in the first place.

I don't think any person or group "chose" that in the active sense. My speculation is that what did the "sifting" was entirely automatic: the code that writes search-result pages saw that these items had no rank, or a corrupt rank, or something like that, and so dropped them from the results lists. (I seem to recall people saying that you could still locate things by searching by title.)

I don't mean there is a value judgement going on ("books with no ranking are not results anyone wants"), but rather something more like error-checking: if part of the record does not match the robot's expected formatting, it may be instructed to drop the whole record as a bad job rather than insert bad data, which might do anything from appearing poorly formatted onscreen to crashing the program.
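To make the error-checking idea concrete, here is a minimal Python sketch. Everything in it is invented for illustration (the record layout, the `sales_rank` field, the `is_valid_rank` and `render_results` functions); it is a guess at the general pattern, not a description of Amazon's actual code.

```python
from typing import Any


def is_valid_rank(value: Any) -> bool:
    """Return True only for a positive integer rank (hypothetical rule)."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def render_results(records: list[dict[str, Any]]) -> list[str]:
    """Build rank-ordered result lines, silently skipping malformed records."""
    # Defensive check: a record with a missing or corrupt rank is dropped
    # whole rather than risk a bad sort or a crash further down.
    usable = [r for r in records if is_valid_rank(r.get("sales_rank"))]
    usable.sort(key=lambda r: r["sales_rank"])
    return [f'{r["sales_rank"]}: {r["title"]}' for r in usable]


# Hypothetical catalog records.
catalog = [
    {"title": "Book A", "sales_rank": 120},
    {"title": "Book B", "sales_rank": None},   # rank removed upstream
    {"title": "Book C", "sales_rank": "n/a"},  # corrupt rank
    {"title": "Book D", "sales_rank": 45},
]

results = render_results(catalog)
print(results)
```

Nothing in `render_results` knows or cares what the books are about. Book B and Book C vanish from the list purely because their rank field failed a format check.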

Obviously I have not deconstructed Amazon's back end, but my programmer husband tells me this is a plausible scenario, if not verifiable in this case.

As to how that rank data problem occurred in the first place (what you are terming "sifting some books out of the sales-ranking factor", i.e. removing them from a possible pool), that's a separate question. I think the explanation that the books got excluded by accident is plausible, as in the scenario I described from cataloging experience, where something semi-automated that alters field B based on what's in fields A and C is misconfigured and yields results you did not intend. It is of course also possible that it was the evil intent people ascribed at first (akin to "let's make sure these kinds of books don't get any rank"), or the personal malice of someone who can alter this sort of data (an inside job rather than an external hack, but not something sanctioned by Amazon).
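The "alters field B based on fields A and C" scenario might look something like the following Python sketch. The field names (`category`, `flagged`, `ranked`), the `require_flag` option, and the category labels are all hypothetical, chosen only to show how one wrong setting widens the effect of a rule.

```python
def apply_rank_rule(records, excluded_categories, require_flag=True):
    """Derive 'ranked' (field B) from 'category' (field A) and 'flagged' (field C).

    Intended behaviour: unrank a record only if its category is excluded
    AND it is individually flagged. With require_flag=False the flag check
    is skipped, so every record in an excluded category is unranked.
    """
    updated = []
    for rec in records:
        in_excluded = rec["category"] in excluded_categories
        hit = in_excluded and (rec["flagged"] or not require_flag)
        updated.append({**rec, "ranked": not hit})
    return updated


# Hypothetical records.
books = [
    {"title": "Book A", "category": "cat-1", "flagged": True},
    {"title": "Book B", "category": "cat-1", "flagged": False},
    {"title": "Book C", "category": "cat-2", "flagged": False},
]

intended = apply_rank_rule(books, {"cat-1"})
misconfigured = apply_rank_rule(books, {"cat-1"}, require_flag=False)

unranked_intended = [b["title"] for b in intended if not b["ranked"]]
unranked_misconfigured = [b["title"] for b in misconfigured if not b["ranked"]]
print(unranked_intended, unranked_misconfigured)
```

With the intended setting only Book A loses its rank; with the one-option mistake Book B goes too, even though nobody flagged it.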

I was not trying to offer a complete explanation proving that this was obviously all an accident; I was simply contesting the single notion that "it was too selective for that to be possible". The apparent selectivity could be a side effect of the specific internal structure of the relational database: poking one part of it in a certain way could produce this result without anyone having specifically intended it.
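As a rough illustration of how database structure alone can make a change look selective, here is a small sketch using Python's built-in `sqlite3` module. The schema (a `categories` table with a `rankable` column, joined to `books`) is an assumption made up for this example and says nothing about how Amazon's data is really organized.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (
        id INTEGER PRIMARY KEY,
        name TEXT,
        rankable INTEGER
    );
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        title TEXT,
        category_id INTEGER REFERENCES categories(id)
    );
    INSERT INTO categories VALUES (1, 'cat-1', 1), (2, 'cat-2', 1);
    INSERT INTO books VALUES (1, 'Book A', 1), (2, 'Book B', 1), (3, 'Book C', 2);
""")


def ranked_titles(connection):
    """Return titles whose category is currently marked rankable."""
    rows = connection.execute(
        "SELECT b.title FROM books b "
        "JOIN categories c ON c.id = b.category_id "
        "WHERE c.rankable = 1 ORDER BY b.title"
    ).fetchall()
    return [title for (title,) in rows]


before = ranked_titles(conn)

# One change to a single category row...
conn.execute("UPDATE categories SET rankable = 0 WHERE name = 'cat-1'")

# ...removes every book joined to that row from the ranked results.
after = ranked_titles(conn)
print(before, after)
```

A single-row update touches no book records at all, yet the result looks as if someone had picked out particular titles one by one.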