I was there when they broke the internet (coincidentally)

Started by David Pilling, June 01, 2023, 10:25:19 AM

David Pilling

The PBS wiki consists of thousands of text entries describing plants. These have been laboriously written by hand, most of them by Mary Sue Ittner, although many others have been involved. I have always found this difficult, just like writing essays at school on subjects I was not familiar with.

The latest wiki entry is for Chlorogalum grandiflorum:

https://www.pacificbulbsociety.org/pbswiki/index.php/Chlorogalum#grandiflorum

I got Google 'bard' to write the text, and I am very proud of the result. It took me no time at all and it is much better English than anything I could usually have done.

The problem is that I have no idea if the material is correct and by now it will have been indexed by Google and will shortly be fed back into 'bard' etc.

We could see if we can stop the wiki contents being used to train AI, or we could decide to not use AI when adding stuff to the wiki. Not that it will make any difference...
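On the first option, the usual mechanism is a robots.txt file at the top of the site. A minimal sketch in Python of checking what such a file would permit. The crawler name "ExampleAIBot" and the rules are made up for illustration (real crawlers publish their own user-agent names), and robots.txt is only a request that a crawler is free to ignore:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. "ExampleAIBot" is an invented crawler name,
# not the name of any real AI crawler.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /pbswiki/

User-agent: *
Allow: /
"""

WIKI_URL = "https://www.pacificbulbsociety.org/pbswiki/index.php/Chlorogalum"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "SomeOtherBot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, WIKI_URL))
```

This only tells you what a well-behaved crawler would do with those rules; it does nothing about text that has already been collected.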


janemcgary

And after many iterations of writing nonfiction text with AI, we will have a trove of fantasy botany.

David Pilling

I am going to analyse what happened here for the benefit of anyone Googling.

I used bard AI to generate the original PBS wiki entry for Chlorogalum grandiflorum - the generated text is shown in italics below.

The first chunk of the AI material consisted of copying everything on Wikipedia:
https://en.wikipedia.org/wiki/Chlorogalum_grandiflorum
Zero marks for that.

is also known as Red Hills soap plant, it is a perennial wildflower that grows from a red or brown-coated bulb up to 7 centimeters wide. The basal leaves have very wavy edges. The inflorescence may be a meter long and is composed of many flowers, each with six tepals which are white with a purple midvein. The tepals are narrow, up to 3 centimeters long, and curl back as they spread open. Each ephemeral flower opens in the evening and closes by the following morning. There are six stamens tipped with yellow anthers. The fruit is a capsule just over half a centimeter long.

Chlorogalum grandiflorum is endemic to the Sierra Nevada foothills of California, where it grows in chaparral, woodland, and forest on serpentine soil. It is a rare plant, and its populations are declining due to habitat loss and fragmentation.


There is one addition to the Wikipedia text: "It is a rare plant..."

Beyond that there are some further additions:

The plant is named for its soapy sap, which can be used to wash clothes and hair. The plant was also used by Native Americans as a medicine for a variety of ailments.

I can't find this for grandiflorum but there is a very interesting article about Soap Plant (Chlorogalum pomeridianum)

https://www.fs.usda.gov/wildflowers/plant-of-the-week/chlorogalum_pomeridianum.shtml

Probably they're similar, kinda sorta right.

The plant is a host plant for the larvae of the checkerspot butterfly.
The plant is used in traditional Chinese medicine.


I can't find either of these. Why would a California native plant be used in traditional Chinese medicine, assuming "traditional" means more than a couple of hundred years old?

The original wiki entry containing the information I've not been able to verify does, at the time of writing, appear in Google search results. But I have now removed any material unique to bard.


Lee Poulsen

There are some things ChatGPT and Bard are good at, and some things where they're so bad that they "hallucinate", which is the term now being used for when AI makes up a perfectly reasonable-sounding article with several accurate-sounding references, but all the facts are false. ChatGPT actually happens to be very good at constructing computer code just by telling it in English what you want the code to do. You do need to know how code works yourself, so you can change the code where it needs changing to make it do what you really want. But it will set up all the variables and structure, use commands from the computer language correctly, and even know which libraries to import, including ones you didn't know existed that do exactly what you wanted.

However, I once tried to see if it could find more information on a bulb species about which I had only found a couple of things after searching Google extensively. It turned out to be worse than using Google myself. I asked it to tell me about Cearanthes fuscoviolacea, which is an amaryllid from Brazil. It told me all about a purple-flowered orchid from Central America. I repeated my request but inserted the words "the bulb" between "about" and "Cearanthes". And it responded that *I* was mistaken that it was a bulb because it is not; it is an orchid. From Central America.

The only things it got right were that they both had purple flowers and they were both plants.
Pasadena, California, USA - USDA Zone 10a
Latitude 34°N, Altitude 1150 ft/350 m

David Pilling

Quote from: Lee Poulsen on June 12, 2023, 02:38:36 PM
ChatGPT actually happens to be very good at constructing computer code

Everyone can now be a programmer, but does everyone want to be a programmer? Most of my time goes on tracking down hard-to-understand bugs in code. Code is just as complex when written in English as when written in a programming language. Experience has taught me to look at libraries with suspicion.

Martin Bohnet

Well, the good thing about libraries is: when a dangerous bug is present, lots of people have reason to find and eliminate it. The bad thing about libraries is: lots of people have reason to find and abuse bugs...
Martin (pronouns: he/his/him)

David Pilling

"Google warns its own employees: Do not use code generated by Bard"
https://www.theregister.com/2023/06/19/even_google_warns_its_own/

I'd be interested to see what sort of code it generates. I need the right project. Presumably "fix all the bugs in Windows 10" is not a good one. And "rotate a bitmap" is something I could do.
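For scale, rotating a bitmap by 90 degrees fits in a few lines. A minimal Python sketch, assuming the bitmap is held as a list of equal-length rows of pixel values. The name rotate_cw is just for illustration, and this is hand-written, not Bard output:

```python
def rotate_cw(bitmap):
    """Rotate a bitmap (a list of equal-length rows) 90 degrees clockwise."""
    # Reversing the row order and then transposing turns the bottom row
    # into the first entry of every new row, i.e. the new left-hand column.
    return [list(column) for column in zip(*bitmap[::-1])]


if __name__ == "__main__":
    picture = [
        [1, 2, 3],
        [4, 5, 6],
    ]
    for row in rotate_cw(picture):
        print(row)
```

Rotating four times gets you back to the original picture, which makes a handy sanity check.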

If you ask me to write some code, the first thing I will do is look at the web to see if anyone has done it already. I'm constantly copying bits of code from Stack Overflow etc., usually because I am programming in languages I don't often write code in.