LLMs in Science: Type 1 Discovery (my term)
This isn't new, probably.
Consider the high-dimensional topology, or landscape: the informational surface sculpted by back-propagation that we navigate. That's still a thing, right?
Suppose that self-attention on that landscape relies on one property: the semantic coordinate space is continuous in the mathematical sense (and un-anchored at 0). That space is discovered token by token, in a traversal through each attention head that converges over time into a full LLM prompt response.
Now say we take that continuity property and apply it to scientific discovery.
Whitepapers and all the other types of published research, experiments and such, are highly focused, well refined, well tested, and have been thought about from so many directions that they are considered robust. The ideas that they are built on, that they embody, and that they put forth are a single thread across that continuous domain of informational surface.
Take a forest of research, similar or dissimilar, and create a landscape from it (as we have in many cases), and there will be gaps. Gaps that can be opportunities: undiscovered musings, possibilities that haven't been discounted, ideas discarded by the old guard, or left, poorly, as an exercise to the reader. Because really they are not obvious, except to that one amateur professor in the 1930s.
Finding those gaps, using LLMs et al., is what I've taken to calling Type 1 Discovery: locating the gaps in human knowledge that our lack of capacity, single-mindedness, need for reproducibility, lack of grant money, or other limitations of the past, non-computational scientific method prevented us from exploring properly.
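
To make the idea of gaps in a landscape a little more concrete, here is a toy sketch in Python. It is an illustration under heavy assumptions, not a method I am claiming anyone uses: each paper is pretended to be a point in a small coordinate space (real embeddings would have hundreds or thousands of dimensions), and a "gap" is simply a spot that is far from every existing paper. The names `nearest_distance` and `find_gaps`, the grid search, and the sample coordinates are all made up for the example.

```python
import math
from itertools import product


def nearest_distance(point, papers):
    """Distance from a candidate point to the closest existing paper."""
    return min(math.dist(point, p) for p in papers)


def find_gaps(papers, steps=5, top_k=3):
    """Return the top_k grid points that are farthest from any paper.

    Assumption: papers are already embedded as equal-length coordinate
    tuples. A coarse grid over their bounding box stands in for the
    continuous landscape; this only scales to a handful of dimensions.
    """
    dims = len(papers[0])
    lows = [min(p[d] for p in papers) for d in range(dims)]
    highs = [max(p[d] for p in papers) for d in range(dims)]
    # Evenly spaced coordinates along each axis of the bounding box.
    axes = [
        [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
        for lo, hi in zip(lows, highs)
    ]
    candidates = [tuple(c) for c in product(*axes)]
    # The emptiest spots are those whose nearest paper is farthest away.
    ranked = sorted(
        candidates,
        key=lambda c: nearest_distance(c, papers),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical 2-D "embeddings" of five papers: four corners of a field
# plus one paper sitting close to an existing one.
papers = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.1, 0.1)]
gaps = find_gaps(papers)
print(gaps[0])
```

In this toy case the emptiest spot is the middle of the field, which no paper sits near. Whether a gap like that is an opportunity or just nonsense is exactly the part the sketch cannot answer; that judgment is where the LLM, and the human, would have to come in.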