If You’re Interested in Archives…

I just heard about this court case, figured I’d pass the news along in case it was of interest.

Full disclosure: I’ve never borrowed books from Archive.org. What I have checked out on their site is their very large section of archived old foreign shows. They have, among other things, about 80 eps of Unfettered Shogun. That is an awesome thing to have preserved.

They’re also apparently a major archive of sources used on Wikipedia, and anyone who does research via the internet has reason to want that intact. So I have a vested interest in them not getting their heads handed to them by a bunch of publishing companies.

I have to admit I’m cranky that this seems to be coming up at all. I have run across a lot of old history books in university libraries and the like, and I know just how lucky I’ve been to have access to them. And how much I wail and gnash teeth trying to find other old history books I want to read that… are either no longer in existence, or you might find one copy out there for $300+. That information is a cultural resource. We’ve had enough culture destroyed and deliberately forgotten. Anything that preserves old books for future use should be encouraged.

But I’m not a legal expert. Feel free to check this out for yourself.



13 thoughts on “If You’re Interested in Archives…”

  1. Oh, I see why the publishers want this destroyed– right now, they license digital-only checkouts that expire after a ludicrously low number of checkouts, so you then have to “replace” the copy. At full hardcopy price.

    If a library can check out a book they physically own, in digital form, even with the limit of actually having to have the same number of copies as they do checkouts– especially for books where the publishers NEVER GOT THE DIGITAL RIGHTS because ebooks didn’t exist– that guts their ability to gouge libraries.


      1. It’s always possible it’s just another round of “those darn customers, if they’d just stop using ebooks we’d be making a lot of money again. It’s totally not anything we’re doing, it’s All Their Fault.”

        With a side of the popular “libraries are theft” thing.


  2. Reblogged this on Head Noises and commented:
    The Big Four (I think there are still four? They keep merging/buying each other) are trying to take out the Internet Archive.

    For bonus “fun,” if these books are old enough– the publishers never bought the digital rights. They weren’t in the contracts back before ebooks existed.


        The Big Four dominate hard-copy fiction because they dominate *ALL* the publishing; the textbook market tends to pay most of the bills, followed by kids’ books. The kind of thing where the person reading them isn’t the customer.
        (For folks curious, here’s a rundown on that drama: https://www.publishersweekly.com/pw/by-topic/industry-news/publisher-news/article/89038-over-the-past-25-years-the-big-publishers-got-bigger-and-fewer.html )

        There are smaller ones (I think they’re usually called “academic presses” as a category?) that may or may not be distributed by the major publishers. I know that the smaller fiction publishers in the US are generally distributed by one of the Big Four. For example, Baen is distributed by Simon & Schuster.

        The John Wiley publishing company (now known as Wiley, but good luck finding anything if you only have that, although it’s also the website address 😀) self-identifies as a publisher of academic, scientific and professional books and journals, and Wiki says they bought the For Dummies series.

        It looks like they aren’t distributed by anybody, which is nice.

        Hachette, of the lawsuit, is one of the Big Four.


      2. Yeah, three of the four Big Four, if my recollection there is correct. I may also owe an apology here for being narrowly pedantic.

        Discovery may prove interesting in this.

        Last industry lawsuit I recall, over the sale of a publishing company, had some interesting things show up in discovery.

        Of course, the chutzpah of Apple and the Big I-forget-however-many suing Amazon sticks in my memory.

        Textbook companies are a bit weird.

        I’m not sure which imprints are distributed by whom, etc.

        I think I had checked John Wiley recently, or something.

        Springer is a very mature textbook and journal publishing company.

        Dover is actively publishing a bunch of old textbook titles, but their business model is charging prices for paperbacks that range from being a bargain to being very reasonable.

        I’ve looked into the EFF’s claims about the case, and if true I am pretty angry at these publishers.


    1. Hence why I tagged with “cranky.” Because this is one of those things that makes me Very Cranky Indeed.

      I write fantasy, yes. But I want it grounded as much as possible in actual science, folklore, medicine, history, etc.

      That is becoming increasingly difficult. Which robs our culture even more. We don’t just need this information for “science!” We need it for the stories that keep us alive.


      1. Yes! That’s exactly why we need this resource!

        Stupid publishers! There wasn’t such a thing as digital rights when the internet first became a thing we all had access to. …I don’t think there was, anyway. But the fact remains: if they get rid of Archive, they only hurt the people they do business with.

        Maybe the textbook writers don’t need it, but fiction writers do!


  3. Yeah, not a legal expert.

    Merits of the case, I dunno.

    So, anything published prior to 192X, I forget the exact year, is in the public domain. And where the renewal paperwork was never filed, the cutoff could be something like the 1970s.

    This has some implications wrt ‘body of human knowledge’, and citations.

    In theory, we have X new scientific publications per year, and these are adding to sum total human knowledge. In practice, some of these are fraud, careful lies, or otherwise false in a way that is difficult to sort out. But, the issue here is the access challenges.

    A PhD requires a project, part of which is a literature review to verify that the work is new. That so many journals and papers are paywalled is not necessarily a problem, because university libraries have access to digital collections. But there are a lot of universities, each library has bought a different, incomplete level of access to the literature, and so each student has a different list of relevant papers to beg, buy, or ignore. In practice, this means that some or all of these literature reviews are incomplete to some degree. Is this an important degree? Maybe not. Fundamentally, the literature has for decades been a bit too wide for a truly comprehensive review.

    If you look in the citations for a technical paper, the references are going to include a) journal citations you have access to b) journal citations that you do not have access to c) textbooks d) conferences, which can be painful e) stuff that can be truly difficult like ‘unpublished internal technical report’ or ‘wrote on a napkin over lunch in 1973’. In particular, proprietary software running on software and hardware not conveniently accessible may be a pain.

    Textbook references are nice, if you can get at the specific book, or if the things referenced are available in most titles of a particular narrow sort. They can suck if you cannot get at the work, or if it really is just one edition of one title, and you need to study the book a while to understand the notation and context.

    Every textbook citation after 192X can still be under copyright.

    There does not seem to be a lot of effort to find and digitize the pre 192X textbooks. Which have relevant bits, and bits that are definitely not enough to figure out the later literature from.

    192X to 197Y is fifty years. Technically speaking, a lot of important things happened in that time. For example, the advancements of WWII alone are a bit significant. Some university libraries have not yet purged the textbooks from that era from their collections. Dover makes a business of finding some of the great titles of history, getting the license, and printing more copies for sale. HathiTrust has some digital copies. Also, DTIC makes available a lot of stuff that was produced with government money, where the feds own the copyright. Yet you will find titles in citations from that era that are pretty hard to get hold of.

    197y to now is another era. Technically, the rise in computer use, and the resulting vast increase in use of numerical methods, was a bit significant there. There have been some pretty significant changes in notations, approaches, etc. There are a number of older textbooks, which are definitely still under the original copyright, that may be important for comparing methods in literature and assessing validity, that were not originally published digitally, which may or may not be available in hard copy at reasonable price, and good luck with lending.

    Reasonable access to digital copies is a bit important for speeding work that may be of profound importance, or perhaps purely academic interest.

    I am probably disappointed with John Wiley. I wish I could say surprised, but I cannot.


  4. 1923 to 2023 has seen some huge changes in the practices behind some of the fundamentals of technology.

    People will say ‘of course’, and point to their favorite new technology, but there is an element that this misses.

    Over the 20th century and the first quarter of the 21st century, there have been substantial developments and discoveries in solid mechanics. However, the basic task remains the same: very often you are deciding what shape to make a part or structure, then calculating or computing the effects of loads on it, and then deciding whether you have chosen the correct design or not.

    The ways that we try to teach people to calculate or compute the effects of loads have changed substantially.

    You have empirical formulas, and analytical or theoretical formulas.

    You can solve a formula with a graphical method, a numerical method, or pencil-and-paper algebra or calculus like those currently studied in school.

    As far as tools go, there are continuous math or ‘analog’ tools, and there are discrete math tools, which are very often digital.

    In 1900, graphical calculation methods were very well developed, and widely used. With the plate printing methods of the day, you could print a complicated graph in a book, and then the practicing engineer could use that printing and a straight edge to arrive at a solution to the formula encoded in the graph. Also, some, maybe many, of the engineers of the day could prepare their own graphical calculation curves if they had a new formula. A lot of the fundamentals of this approach are no longer widely studied, because you can get better precision faster and more reliably using other methods. One of the few surviving methods of this style is “Mohr’s circle”, which is a scheme for a formula or two used in mechanics.
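    For anyone who has not met Mohr’s circle: a minimal sketch of the arithmetic the drawing encodes, assuming the usual plane-stress case. The function name and the sample stress values are made up for illustration; they do not come from any particular textbook.

```python
import math

def mohr_circle(sigma_x, sigma_y, tau_xy):
    """Return (center, radius, sigma_1, sigma_2) for a plane-stress state.

    The circle is drawn on axes of normal stress (horizontal) and
    shear stress (vertical). Its center and radius give the principal
    stresses directly; the radius is also the maximum in-plane shear.
    """
    center = (sigma_x + sigma_y) / 2.0
    radius = math.hypot((sigma_x - sigma_y) / 2.0, tau_xy)
    sigma_1 = center + radius  # larger principal stress
    sigma_2 = center - radius  # smaller principal stress
    return center, radius, sigma_1, sigma_2

# Hypothetical stress state, in whatever consistent unit you like (e.g. MPa).
center, radius, sigma_1, sigma_2 = mohr_circle(80.0, 20.0, 40.0)
print(center, radius, sigma_1, sigma_2)
```

    With compass and straightedge you would plot the two stress points, draw the circle through them, and read the same four numbers off the paper.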

    The source of the formula is a bit less interesting here. Though, a lot of experiments and calculations went into even the fanciest of the ‘theoretical’ formulas that we usually learn these days by ‘writing lines’ of mathematical derivation.

    There are several ways to implement a numerical method, or a ‘computer’.

    If I have a formula for the variable I want, and know values of all of the variables in the formula, I can calculate the value of the variable, unless it is a formula that ‘sucks’ in some way. If it sucks, and I have to approximate a step in some way, those approximations may be a numerical method.
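    As a rough illustration of a formula that ‘sucks’: x = cos(x) cannot be rearranged to put x alone on one side, so you approximate. The sketch below uses bisection, which is only one of many possible numerical methods; the equation and the names are picked purely as an example.

```python
import math

def bisect(f, lo, hi, tol=1e-10, max_iter=200):
    """Approximate a root of f between lo and hi by repeated halving.

    Requires f(lo) and f(hi) to have opposite signs, so that a sign
    change (and, for a continuous f, a root) lies inside the interval.
    """
    f_lo = f(lo)
    f_hi = f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        # Stop when we hit the root exactly or the interval is tiny.
        if f_mid == 0 or (hi - lo) / 2.0 < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid               # root is in the lower half
        else:
            lo, f_lo = mid, f_mid  # root is in the upper half
    return (lo + hi) / 2.0

# Solve x = cos(x) by finding where x - cos(x) crosses zero.
root = bisect(lambda x: x - math.cos(x), 0.0, 1.0)
print(root)
```

    Every pass through the loop is just an evaluation, a comparison, and a halving, which is exactly the kind of step that could be handed to a room of human computers or to a machine.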

    Some numerical methods have many steps, and doing them yourself by hand gets difficult. One approach, much too expensive in labor these days, is a team of people trained in the basic mathematical steps.

    Also no longer as heavily used, the case where a mathematician, a team of human computers, or a machine has been used to calculate an extensive table of values, and ordinary math workers are doing a bunch of table look ups.

    Like with a graphical calculation, you can make an analog computer out of a machine. See, slide rules. But discrete mechanical calculators are also possible; see the abacus. Mechanical calculation, and electromechanical calculation, reached a very high level by the mid twentieth century. However, changing the ‘state’ of a mechanical computer can be cumbersome, and complex mechanisms have maintenance issues. So general purpose mechanical computers with truly excellent capabilities can be difficult to realize.

    In the third quarter of the 20th century, we started to develop the art of purely electrical analog computing. Analog computing can do tasks that are not native to digital methods, but noise contamination is inherently a problem with analog methods. Tubes and transistors can be used for analog methods, or for digital methods. But solid state transistors are potentially cheaper and more compact, and can have a much longer time between failures than tubes. This made digital logic possible in a small package, and digital logic allows for discrete mathematics and /ignoring/ noise in the digital signal.

    Tada, programmable calculators, and a bunch of fancy programs running on general purpose electronic digital computers that can do much heavy lifting. To include previously impossible levels of complexity in doing calculations with numerical methods.

    Computers are obviously a ‘new’ technology. The issue is the shifts in methods allowed by the computers. Those one hundred years of copyrighted material potentially contain a great wealth of fundamental underpinnings that may not be easily available elsewhere. Obviously, if you can’t make money by distributing information, you have no incentive to find more information to make money by distributing. But extremely aggressive efforts to protect and monetize those copyrights could potentially close off access to this valuable information.

    Much hay is being made now in academia about adjusting processes so that ‘historically poor and disadvantaged’ groups find the study of engineering more ‘accessible’. Federal funding has poured a great deal of money into universities, and this has allowed sellers of textbooks and journals to charge very high prices. The public may trust ‘experts’ and ‘professionals’, but if the only way to become an expert is to become a ‘made man’ in the academic ‘omerta’, the public is likely to wind up cheated, and very badly served. It is also better if the new generation of professionals has alternatives to blindly trusting what they have been told. For these reasons, I believe that there is a public good argument for keeping the obstacles to accessing this information from being unduly burdensome.

