For the image-only DL model, we implemented a deep convolutional neural network (ResNet18 [13]) with PyTorch (version 0.31; pytorch.org). Given a 1664 × 2048 pixel view of a breast, the DL model was trained to predict whether or not that breast would develop breast cancer within 5 years.
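For readers who want a concrete picture of the setup that excerpt describes, here is a minimal sketch using torchvision's stock ResNet18. This is not the authors' code; the single-channel first layer and the two-class head are my assumptions about how the described input and output would be wired up.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(weights=None)                           # plain ResNet18 backbone
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2,  # assume grayscale mammograms,
                        padding=3, bias=False)           # not 3-channel RGB input
model.fc = nn.Linear(model.fc.in_features, 2)            # "develops cancer within 5 years": yes/no

x = torch.randn(1, 1, 1664, 2048)                        # one 1664 x 2048 view
logits = model(x)                                        # shape: (1, 2)
```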
The only "innovation" here is feeding full view mammograms to a ResNet18(2016 model). The traditional risk factors regression is nothing special (barely machine learning). They don't go in depth about how they combine the two for the hybrid model, so it's probably safe to assume it is something simple (merely combining the results, so nothing special in the training step). edit: I stand corrected, commenter below pointed out the appendix, and the regression does in fact come into play in the training step
As a different commenter mentioned, the data collection is largely the interesting part here.
I'll admit I was wrong about my first guess as to the network topology used though, I was thinking they used something like auto encoders (but that is mostly used in cases where examples of bad samples are rare)
They don't go in depth about how they combine the two for the hybrid model
Actually they did, it's in Appendix E (PDF warning)
. A GitHub repo would have been nice, but I think there would be enough info to replicate this if we had the data.
Yeah it's not the most interesting paper in the world. But it's still a cool use IMO even if it might not be novel enough to deserve a news article.
According to the paper cited by the article OP posted, there is no LLM in the model. If I read it correctly, the paper says that it uses PyTorch's implementation of ResNet18, a deep convolutional neural network that isn't specifically designed to work on text. So this term would be inaccurate.
or a pattern recognition model.
Much better term IMO, especially since it uses a convolutional network. But since the article is a news publication, not a serious academic paper, the author knows the term "AI" gets clicks and positive impressions (which is what their job actually is) and we wouldn't be here talking about it.
Well, this is very much an application of AI... Having more examples of recent AI development that aren't 'chatgpt'(/transformers-based) is probably a good thing.
Op is not saying this isn't using the techniques associated with the term AI. They're saying that the term AI is misleading, broad, and generally not desirable in a technical publication.
It's literally the name of the field of study. Chances are this uses the same thing as LLMs. Aka a neutral network, which are some of the oldest AIs around.
It refers to anything that simulates intelligence. They are using the correct word. People just misunderstand it.
The problem is that it refers to so many and constantly changing things that it doesn't refer to anything specific in the end. You can replace the word "AI" in any sentence with the word "magic" and it basically says the same thing...
It's really difficult to clean those data. Another case was, when they kept the markings on the training data and the result was, those who had cancer, had a doctors signature on it, so the AI could always tell the cancer from the not cancer images, going by the lack of signature. However, these people also get smarter in picking their training data, so it's not impossible to work properly at some point.
That’s the nice thing about machine learning, as it sees nothing but something that correlates. That’s why data science is such a complex topic, as you do not see errors this easily. Testing a model is still very underrated and usually there is no time to properly test a model.
For reference here in Australia my wife has been asking to get mammograms for years now (in her 30s) and she keeps getting told she’s too young because she doesn’t have a familial history. That issue is a bit pervasive in countries other than the US.
Better yet, give us something better to do about the cancer than slash, burn, poison. Something that's less traumatic on the rest of the person, especially in light of the possibility of false positives.
If it has just as low of a false negative rate as human-read mammograms, I see no issue. Feed it through the AI first before having a human check the positive results only. Save doctors' time when the scan is so clean that even the AI doesn't see anything fishy.
Alternatively, if it has a lower false positive rate, have doctors check the negative results only. If the AI sees something then it's DEFINITELY worth a biopsy. Then have a human doctor check the negative readings just to make sure they don't let anything that's worth looking into go unnoticed.
Either way, as long as it isn't worse than humans in both kinds of failures, it's useful at saving medical resources.
an image recognition model like this is usually tuned specifically to have a very low false negative (well below human, often) in exchange for a high false positive rate (overly cautious about cancer)!
This is exactly what is being done. My eldest child is in a Ph. D. program for human - robot interaction and medical intervention, and has worked on image analysis systems in this field. They're intended use is exactly that - a "first look" and "second look". A first look to help catch the small, easily overlooked pre-tumors, and tentatively mark clear ones. A second look to be a safety net for tired, overworked, or outdated eyes.
For me, the main takeaway doesn't have anything to do with the details though, it's about the true usefulness of AI. The details of the implementation aren't important, the general use case is the main point.
It's got a decent chunk of good uses. It's just that none of those are going to make anyone a huge ton of money, so they don't have a hype cycle attached. I can't wait until the grifters get out and the hype cycle falls away, so we can actually get back to using it for what it's good at and not shoving it indiscriminately into everything.
The hypesters and grifters do not prevent AI from being used for truly valuable things even now. In fact medical uses will be one of those things that WILL keep AI from just fading away.
Just look at those marketing wankers as a cherry on the top that you didn't want or need.
The hypesters and grifters do not prevent AI from being used for truly valuable things even now.
I mean, yeah, except that the unnecessary applications are all the corporations are paying anyone to do these days. When the hype flies around like this, the C-suite starts trying to micromanage the product team's roadmap. Once it dies down, they let us get back to work.
I think the vast majority of people understand that already. They don't understand just what all those gadgets are for anyway. Medicine is largely a ''blackbox" or magical process anyway.
There are way too many techbros trying to push the idea of turning chat gpt into a physician replacement. After it "passed" the board exams, they immediately started hollering about how physicians are outdated and too expensive and we can just replace them with AI. What that ignores is the fact that the board exam is multiple choice and a massive portion of medical student evaluation is on the "art" side of medicine that involves taking the history and performing the physical exam that the question stem provides for the multiple choice questions.
Also, for GPU prices to come down. Right now the AI garbage is eating a lot of the GPU production, as well as wasting a ton of energy. It sucks. Right as the crypto stuff started dying out we got AI crap.
It's a money saver, so it's profit model is all wonky.
A hospital, as a business, will make more money treating cancer than it will doing a mammogram and having a computer identify issues for preventative treatment.
A hospital, as a place that helps people, will still want to use these scans widely because "ignoring preventative care to profit off long term treatment" is a bit too "mask off" even for the US healthcare system and doctors would quit.
Insurance companies, however, would pay just shy of the cost of treatment to avoid paying for treatment.
So the cost will rise to be the cost of treatment times the incidence rate, scaled to the likelihood the scan catches something, plus system costs and staff costs.
In a sane system, we'd pass a law saying capable facilities must provide preventative screenings at cost where there's a reasonable chance the scan would provide meaningful information and have the government pay the bill. Everyone's happy except people who view healthcare as an investment opportunity.
A hospital, as a business, will make more money treating cancer than it will doing a mammogram and having a computer identify issues for preventative treatment.
I believe this idea was generally debunked a little while ago; to wit, the profit margin on cancer care just isn't as big (you have to pay a lot of doctors) as the profit margin on mammograms. Moreover, you're less likely to actually get paid the later you identify it (because end-of-life care costs for the deceased tend to get settled rather than being paid).
I'll come back and drop the article link here, if I can find it.
Oh interesting, I'd be happy to be wrong on that. :)
I figured they'd factor the staffing costs into what they charge the insurance, so it'd be more profit due to a higher fixed costs, longer treatment and some fixed percentage profit margin.
The estate costs thing is unfortunately an avenue I hadn't considered. :/
I still think it would be better if we removed the profit incentive entirely, but I'm pleased if the two interests are aligned if we have to have both.
It sure is. But this is basically just making something that already exists more reliable, not creating something new. Still important, but not as earth-shaking.
Those are going to make a ton of money for a lot of people. Every 1% fuel efficiency gained, every second saved in an industrial process, it's hundreds of millions of dollars.
You don't need AI in your fridge or in your snickers, that will (hopefully) die off, but AI is not going away where it matters.
Those are going to make a ton of money for a lot of people.
Right, but not any *one* person. The people running the hype train want to be that one person, but the real uses just aren't going to be something you can exclusively monetize.
Depends how you define "a ton" of money. Plenty of startups have been acquired for silly amounts of money, plenty of consultants are making bank, make executives are cashing big bonuses for successful improvements using AI...
I define "a ton" of money in this case to mean "the amount they think of when they get the dollar signs in their eyes." People are cashing in on that delusion right now, but it's not going to last.
machine learning is a type of AI. scifi movies just misused the term and now the startups are riding the hype trains. AGI =/= AI. there's lots of stuff to complain about with ai these days like stable diffusion image generation and LLMs, but the fact that they are AI is simply true.
I mean it's entirely an arbitrary distinction. AI, for a very long time before chatGPT, meant something like AGI. we didn't call classification models 'intelligent' because it didn't have any human-like characteristics. It's as silly as saying a regression model is AI. They aren't intelligent things.
I once had ideas about building a machine learning program to assist workflows in Emergency Departments, and its' training data would be entirely generated by the specific ER it's deployed in. Because of differences in populations, the data is not always readily transferable between departments.
This is a different type of AI that doesn't have as many consumer facing qualities.
The ones that are being pushed now are the first types of AI to have an actually discernable consumer facing attribute or behavior, and so they're being pushed because no one wants to miss the boat.
They're not more profitable or better or actually doing anything anyone wants for the most part, they're just being used where they can fit it in.
This type of segmentation is of declining practical value. Modern AI implementations are usually hybrids of several categories of constructed intelligence.
Unfortunately AI models like this one often never make it to the clinic. The model could be impressive enough to identify 100% of cases that will develop breast cancer. However if it has a false positive rate of say 5% it’s use may actually create more harm than it intends to prevent.
Another big thing to note, we recently had a different but VERY similar headline about finding typhoid early and was able to point it out more accurately than doctors could.
But when they examined the AI to see what it was doing, it turns out that it was weighing the specs of the machine being used to do the scan... An older machine means the area was likely poorer and therefore more likely to have typhoid. The AI wasn't pointing out if someone had Typhoid it was just telling you if they were in a rich area or not.
That's actually really smart. But that info wasn't given to doctors examining the scan, so it's not a fair comparison. It's a valid diagnostic technique to focus on the particular problems in the local area.
"When you hear hoofbeats, think horses not zebras" (outside of Africa)
AI is weird. It may not have been given the information explicitly. Instead it could be an artifact in the scan itself due to the different equipment. Like if one scan was lower resolution than the others but you resized all of the scans to be the same size as the lowest one the AI might be picking up on the resizing artifacts which are not present in the lower resolution one.
The manufacturing date of the scanner was actually saved as embedded metadata to the scan files themselves. None of the researchers considered that to be a thing until after the experiment when they found that it was THE thing that the machines looked at.
I'm saying that info is readily available to doctors in real life. They are literally in the hospital and know what the socioeconomic background of the patient is. In real life they would be able to guess the same.
The thing is tho... It has a better detection rate ON THE SAMPLES THEY HAD but because it wasn't actually detecting anything other than wealth there was no way for them to trust it would stay accurate.
Right, there's typically separate "training" and "validation" sets for a model to train, validate, and iterate on, and then a totally separate "test" dataset that measures how effective the model is on similar data that it wasn't trained on.
If the model gets good results on the validation dataset but less good on the test dataset, that typically means that it's "over fit". Essentially the model started memorizing frivolous details specific to the validation set that while they do improve evaluation results on that specific dataset, they do nothing or even hurt the results for the testing and other datasets that weren't a part of training. Basically, the model failed to abstract what it's supposed to detect, only managing good results in validation through brute memorization.
I'm not sure if that's quite what's happening in maven's description though. If it's real my initial thoughts are an unrepresentative dataset + failing to reach high accuracy to begin with. I buy that there's a correlation between machine specs and positive cases, but I'm sure it's not a perfect correlation. Like maven said, old areas get new machines sometimes. If the models accuracy was never high to begin with, that correlation may just be the models best guess. Even though I'm sure that it would always take machine specs into account as long as they're part of the dataset, if actual symptoms correlate more strongly to positive diagnoses than machine specs do, then I'd expect the model to evaluate primarily on symptoms, and thus be more accurate. Sorry this got longer than I wanted
What if one of those lower economic areas decides that the machine is too old and they need to replace it with a brand new one? Now every single case is a false negative because of how highly that was rated in the system.
The data they had collected followed that trend but there is no way to think that it'll last forever or remain consistent because it isn't about the person it's just about class.
That’s just not generally true. Mammograms are usually only recommended to women over 40. That’s because the rates of breast cancer in women under 40 are low enough that testing them would cause more harm than good thanks in part to the problem of false positives.
Nearly 4 out of 5 that progress to biopsy are benign. Nearly 4 times that are called for additional evaluation. The false positives are quite high compared to other imaging. It is designed that way, to decrease the chances of a false negative.
The false negative rate is also quite high. It will miss about 1 in 5 women with cancer. The reality is mammography is just not all that powerful as a screening tool. That’s why the criteria for who gets screened and how often has been tailored to try and ensure the benefits outweigh the risks. Although it is an ongoing debate in the medical community to determine just exactly what those criteria should be.
A false positive of even 50% can mean telling the patient "they are at a higher risk of developing breast cancer and should get screened every 6 months instead of every year for the next 5 years".
Keep in mind that women have about a 12% chance of getting breast cancer at some point in their lives. During the highest risk years its a 2 percent chamce per year, so a machine with a 50% false positive for a 5 year prediction would still only be telling like 15% of women to be screened more often.
How would a false positive create more harm? Isn't it better to cast a wide net and detect more possible cases? Then false negatives are the ones that worry me the most.
It’s a common problem in diagnostics and it’s why mammograms aren’t recommended to women under 40.
Let’s say you have 10,000 patients. 10 have cancer or a precancerous lesion. Your test may be able to identify all 10 of those patients. However, if it has a false positive rate of 5% that’s around 500 patients who will now get biopsies and potentially surgery that they don’t actually need. Those follow up procedures carry their own risks and harms for those 500 patients. In total, that harm may outweigh the benefit of an earlier diagnosis in those 10 patients who have cancer.
Well it'd certainly benefit the medical industry. They'd be saddling tons of patients with surgeries, chemotherapy, mastectomy, and other treatments, "because doctor-GPT said so."
But imagine being a patient getting physically and emotionally altered, plunged into irrecoverable debt, distressing your family, and it all being a whoopsy by some black-box software.
That's a good point, that it could burden the system, but why would you ever put someone on chemotherapy for the model described in the paper? It seems more like it could burden the system by increasing the number of patients doing more frequent screening. Someone has to pay for all those docter-patient and meeting hours for sure. But the benefit outweighs this cost (which in my opinion is good and cheap since it prevents future treatment at later stages that are expensive).
Biopsies are small but still invasive. There's risk of infection or reactions to anesthesia in any surgery. If 100 million women get this test, a 5% false positive rate will mean 5 million unnecessary interventions. Not to mention the stress of being told you have cancer.
5 million unnecessary interventions means a small percentage of those people (thousands) will die or be harmed by the treatment. That's the harm that it causes.
You have really good point too! Maybe just an indication of higher risk, and just saying "Hey, screening more often couldn't hurt." Might actually be a net positive, and wouldn't warrant such extreme measures unless it was positively identified by, hopefully, human professionals.
You're right though, there always seems to be more demand than supply for anything medicine related. Not to mention, here in the U.S for example, needless extra screenings could also heavily impact a lot of people.
Serious question: is there a way to get access to medical imagery as a non-student? I would love to do some machine learning with it myself, as I see lots of potential in image analysis in general. 5 years ago I created a model that was able to spot certain types of ships based only on satellite imagery, ships which were not easily detectable by eye (never mind that one human cannot scan 15k images in an hour).
Similar use case with medical imagery - seeing the things that are not yet detectable by human eyes.
Yeah there are some openly available datasets on competition sites like Kaggle, and some medical data is available through public institutions like the NIH.
This is similar to what I did for my master's, except it was lung cancer.
Stuff like this is actually relatively easy to do, but the regulations you need to conform to and the testing you have to do first are extremely stringent. We had something that worked for like 95% of cases within a couple months, but it wasn't until almost 2 years later they got to do their first actual trial.
I had a housemate a couple of years ago who had a side job where she'd look through a load of these and confirm which were accurate. She didn't say it was AI though.
This is a great use of tech. With that said I find that the lines are blurred between "AI" and Machine Learning.
Real Question: Other than the specific tuning of the recognition model, how is this really different from something like Facebook automatically tagging images of you and your friends? Instead of saying "Here's a picture of Billy (maybe) " it's saying, "Here's a picture of some precancerous masses (maybe)".
That tech has been around for a while (at least 15 years). I remember Picasa doing something similar as a desktop program on Windows.
I've been looking at the paper, some things about it:
- the paper and article are from 2021
- the model needs to be able to use optional data from age, family history, etc, but not be reliant on it
- it needs to combine information from multiple views
- it predicts risk for each year in the next 5 years
- it has to produce consistent results with different sensors and diverse patients
- it's not the first model to do this, and it is more accurate than previous methods
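On the per-year risk point: one common way (not necessarily what this paper does) to get risks for years 1 through 5 that never decrease over time is to predict a hazard per year and accumulate it:

```python
import torch

yearly_logits = torch.randn(1, 5)                    # stand-in model output, one logit per year
hazards = torch.sigmoid(yearly_logits)               # P(onset in year k, given no onset yet)
cum_risk = 1 - torch.cumprod(1 - hazards, dim=1)     # P(onset by year k): non-decreasing over k
print(cum_risk)
```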
I think it's a joke, like to imply they want to not just reiterate, but rerererereiterate this information, both because it's good news and also in light of all the sucky ways AI is being used instead.
Like at first they typed "I just want to reiterate..." but decided that wasn't nearly enough.
Ductal carcinoma in situ (DCIS) is a type of preinvasive tumor that sometimes progresses to a highly deadly form of breast cancer. It accounts for about 25 percent of all breast cancer diagnoses.
Because it is difficult for clinicians to determine the type and stage of DCIS, patients with DCIS are often overtreated. To address this, an interdisciplinary team of researchers from MIT and ETH Zurich developed an AI model that can identify the different stages of DCIS from a cheap and easy-to-obtain breast tissue image. Their model shows that both the state and arrangement of cells in a tissue sample are important for determining the stage of DCIS.
Our clinics are already using AI to clean up MRI images for easier and higher-quality reads. We use AI on our cath lab table to provide a less noisy image at a much lower rad dose.
It’s not diagnosing, which is good imho. It’s just being used to remove noise and artifacts from the images on the scan. This means the MRI is clearer for the reading physician and ordering surgeon in the case of the MRI and that the cardiologist can use less radiation during the procedure yet get the same quality image in the lab.
I’m still wary of using it to diagnose in basically any scenario because of the salience and danger that both false negatives and false positives threaten.
The most beneficial application of AI like this is to reverse-engineer the neural network to figure out how the AI works. In this way we may discover a new technique or procedure, or we might find out the AI's methods are bullshit. Under no circumstance should we accept a "black box" explanation.
good luck reverse-engineering millions if not billions of seemingly random floating point numbers. It's like visualizing a graph in your mind by reading an array of numbers, except in this case the graph has as many dimensions as the neural network has inputs, which is the number of pixels the input image has.
Under no circumstance should we accept a "black box" explanation.
Go learn at least basic principles of neural networks, because this sentence of yours alone makes me want to slap you.
Don't worry, researchers will just get an AI to interpret all those floating point numbers and come up with a human-readable explanation! What could go wrong? /s
As a practical example, new regulations by the European Union proposed that individuals affected by algorithmic decisions have a right to an explanation. To allow this, algorithmic decisions must be explainable, contestable, and modifiable in the case that they are incorrect.
Oh yeah. I forgot about that. I hope your model is understandable enough that it doesn't get you in trouble with the EU.
Look, maybe you were having a bad day, or maybe slapping people is literally your favorite thing to do, who am I to take away mankind's finer pleasures, but this attitude of yours is profoundly stupid. It's weak. You don't want to know? It doesn't make you curious? Why are you comfortable not knowing things? That's not how science is propelled forward.
"Enough" is doing a fucking ton of heavy lifting there. You cannot explain a terabyte of floating point numbers. Same way you cannot *guarantee* a specific doctor or MRI technician isn't racist.
A single drop of water contains billions of molecules, and yet, we can explain a river. Maybe you should try applying yourself. The field of hydrology awaits you.
No, we cannot explain a river, or the atmosphere. Hence weather forecasts are only good for a few days, and even after massive computer simulations, aircraft/cars/ships still need to do tunnel testing and real-life testing, because we can only approximate the real thing in our models.
This one's from 2019: Link
I was a bit off the mark; it's not that the models they use aren't black boxes, it's just that they could have made them interpretable from the beginning and chose not to, likely due to liability.
Well, in theory you can explain how the model comes to its conclusion. However, I'd guess that only 0.1% of "AI engineers" are actually capable of that, and they probably cost 100k per month.
It depends on the algorithms used. The lazy approach is to just throw neural networks at everything and waste immense computational resources; of course you then get results that are difficult to interpret. There are much more efficient algorithms that work well for many problems and give you interpretable decisions.
IMO, the "black box" thing is basically ML developers hand waiving and saying "it's magic" because they know it will take way too long to explain all the underlying concepts in order to even start to explain how it works.
I have a very crude understanding of the technology. I'm not a developer, I work in IT support. I have several friends that I've spoken to about it, some of whom have made fairly rudimentary machine learning algorithms and neural nets. They understand it, and they've explained a few of the concepts to me, and I'd be lying if I said that none of it went over my head. I've done programming and development, I'm senior in my role, and I have a lifetime of technology experience and education... And it goes over my head. What hope does anyone else have? If you're not a developer or someone ML-focused, yeah, it's basically magic.
I won't try to explain. I couldn't possibly recall enough about what has been said to me, to correctly explain anything at this point.
The AI developers understand how AI works, but that does not mean that they understand the thing that the AI is trained to detect.
For instance, the cutting edge in protein folding (at least as of a few years ago) is Google's AlphaFold. I'm sure the AI researchers behind AlphaFold understand AI and how it works. And I am sure that they have an above-average understanding of molecular biology. However, they do not understand protein folding better than the physicists and chemists who have spent their lives studying the field. The core of their understanding is "the answer is somewhere in this dataset; all we need to do is figure out how to throw ungodly amounts of compute at it, and we can make predictions". Working out how to productively throw that much compute power at a problem is not easy either, and *that* is what ML researchers understand and are experts in.
In the same way, the researchers here understand how to go from a large dataset of breast images to cancer predictions, but that does not mean they have any understanding of cancer. And certainly not a better understanding than the researchers who have spent their lives studying it.
An open problem in ML research is how to take the billions of parameters that define an ML model and extract useful information that can provide insights to help human experts understand the system (both in general, and in understanding the reasoning for a specific classification). Progress has been made here as well, but it is still a long way from being solved.
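As a small taste of that line of work, here is one of the simplest techniques: input-gradient saliency. It does not "explain" the network, but it highlights which pixels most influenced a particular prediction. Generic sketch; the ResNet below is just a stand-in classifier.

```python
import torch
from torchvision.models import resnet18

def saliency_map(model, image, target_class):
    """Absolute input gradient: which pixels most influenced this prediction."""
    model.eval()
    image = image.clone().requires_grad_(True)       # track gradients w.r.t. the pixels
    model(image)[0, target_class].backward()
    return image.grad.abs().squeeze(0)               # high values = influential pixels

model = resnet18(weights=None)                       # stand-in image classifier
heat = saliency_map(model, torch.randn(1, 3, 224, 224), target_class=1)
print(heat.shape)                                    # torch.Size([3, 224, 224])
```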
Thank you for giving some insights into ML, which is now often just branded "AI". Just one note, though: there are many ML algorithms that do not employ neural networks and don't have billions of parameters. Especially in binary-choice image recognition (looks like cancer or not), stuff like support vector machines achieves great results with very few parameters.
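To illustrate the "very few parameters" point, here is a linear SVM on scikit-learn's built-in (tabular, not imaging) breast cancer dataset; the whole classifier is about 30 weights plus a bias:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = make_pipeline(StandardScaler(), LinearSVC())       # scale features, then a linear SVM
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))                         # held-out accuracy
print(clf.named_steps["linearsvc"].coef_.shape)          # (1, 30): one weight per feature
```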
Machine learning is a subset of Artificial intelligence, which is a field of research as old as computer science itself
The traditional goals of AI research include reasoning, knowledge representation, planning, learning, natural language processing, perception, and support for robotics.[a] General intelligence—the ability to complete any task performable by a human on an at least equal level—is among the field's long-term goals.[16]
*shrugs* you know people have been confidently making these kinds of statements... since written language was invented? I bet the first person who developed written language did it to complain about how this generation of kids don't know how to write a proper sentence.
What is in freefall is the economy for the middle and working class and basic idea that artists and writers should be compensated, period. What has released us into freefall is that making art and crafting words are shit on by society as not a respectable job worth being paid a living wage for.
There are a terrifying amount of good writers out there, more than there have ever been, both in total number AND per capita.
Sure, I definitely overreacted and I honestly was pretty stressed out the day I replied so yeah, fair. I think I have a point, this just wasn't the salient place for it and I was too tired to realize that in the moment.
The AI genie is out of the bottle and — as much as we complain — it isn't going away; we need thoughtful legislation. AI is going to take my job? Fine, I guess? That sounds good, really. Can I have a guaranteed income to live on, because I still need to live? Can we tax the rich?
Terrible things happen to people you love, you have two choices in this life. You can laugh about it or you can cry about it. You can do one and then the other if you choose. I prefer to laugh about most things and hope others will do the same. Cheers.
I mean do whatever you want but it just comes off as repulsive. like a stain of shit on the new shoes.
This is public space after all, not the bois locker room so that might be embarrassing for you.
And you know you can always count on me to point stuff out so you can avoid humiliation in the future
Thanks for your excessively unnecessary put down. Don't worry though. No matter how hard you try, you won't be able to stop me from enjoying my life and bringing joy to others. Why are you obsessed with shit btw?
The test is 90% accurate, that's still pretty useful. Especially if you are simply putting people into a high-risk group that needs to be more closely monitored.
Also, where the hell did you pull that number from?
Well, you can just do the math yourself, it's pretty straight-forward.
However, more to the point, it's taken right from around 38 seconds into the video. Kind of funny to be accused of "not watching the video" by someone who is implying the number was pulled from nowhere, when it's right in the video.
I certainly don't think this closes the book on anything, but I'm responding to your claim that it's not useful. If this is a cheap and easy test, it's a great screening tool putting people into groups of low risk/high risk for which further, maybe more expensive/specific/sensitive, tests can be done. Especially if it can do this early.
It's probably more "AI" than the LLMs we've been plagued with. This sounds more like an application of machine learning, which is a hell of a lot more promising.
AI and machine learning are very similar (if not identical) things, just one has been turned into a marketing hype word a whole lot more than the other.
Machine learning is one of the many things that is referred to by "AI", yes.
My thought is the term "AI" has been overused to uselessness, from the nested if statements that decide how video game enemies move to various kinds of machine learning to large language models.
AI == Computer Thingy that looks kinda "smart" to people that don't understand it. it's like rectangles and squares. you should use the more precise word (CNN, LLM, Stable diffusion) when applicable, just like with rectangles and squares
This seems exactly like what I would have referred to as AI before the pandemic. Specifically Deep Learning image processing. In terms of something you can buy off the shelf this is theoretically something the Cognex Vidi Red Tool could be used for. My experience with it is in packaging, but the base concept is the same.
Training a model requires loading images into the software and having a human mark them before having a very powerful CUDA GPU process all of that. Once the model has been trained it can usually be run on a fairly modest PC in comparison.
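In PyTorch terms, that "train on a beefy CUDA box, run on a modest PC" split is just device handling, roughly like this (generic sketch with a stand-in model):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 2))   # stand-in for a real network
device = "cuda" if torch.cuda.is_available() else "cpu"      # the powerful training machine
model.to(device)
# ... training loop runs here, on the GPU ...
torch.save(model.state_dict(), "model.pt")

# Later, on a modest PC without a GPU:
deployed = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 2))
deployed.load_state_dict(torch.load("model.pt", map_location="cpu"))
deployed.eval()
with torch.no_grad():
    out = deployed(torch.randn(1, 1, 28, 28))                # inference is fine on CPU
```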
And much before that it was rule-based machine learning, which was basically databases and fancy inference algorithms.
So I guess "AI" has always meant "the most advanced computer science thing which looks kind of intelligent". It's only now that it looks intelligent enough to fool laypeople into thinking there actually is intelligence there.
Haha I love Gell-Mann amnesia. A few weeks ago there was news about speeding up the internet to gazillion bytes per nanosecond and it turned out to be fake.
Now this thing is all over the internet and everyone believes it.
it's ai, it all is. the code that controls where the creepers in Minecraft go? AI. the tiny little neural network that can detect numbers? also AI! is it AGI? no. it's still AI. it's not that modern tech is stealing the term ai, scifi movies are the ones that started misusing it and cash grab startups are riding the hypetrain.
Using AI for anomaly detection is nothing new though. Haven't read any article about this specific 'discovery' but usually this uses a completely different technique than the AI that comes to mind when people think of AI these days.
From the conclusion of the actual paper:
If I read this paper correctly, the novelty is in the model, which is a deep learning model that works on mammogram images + traditional risk factors.
I skimmed the paper. As you said, they made an ML model that takes images and traditional risk factors (TCv8).
I would love to see a comparison against risk factors + human image evaluation.
Nevertheless, this is the AI that will really help humanity.
The only "innovation" here is feeding full view mammograms to a ResNet18(2016 model). The traditional risk factors regression is nothing special (barely machine learning). They don't go in depth about how they combine the two for the hybrid model,
so it's probably safe to assume it is something simple (merely combining the results, so nothing special in the training step).edit: I stand corrected, commenter below pointed out the appendix, and the regression does in fact come into play in the training stepAs a different commenter mentioned, the data collection is largely the interesting part here.
I'll admit I was wrong about my first guess as to the network topology used though; I was thinking they used something like autoencoders (but those are mostly used in cases where examples of bad samples are rare).
Actually they did, it's in Appendix E (PDF warning). A GitHub repo would have been nice, but I think there would be enough info to replicate this if we had the data.
Yeah it's not the most interesting paper in the world. But it's still a cool use IMO even if it might not be novel enough to deserve a news article.
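On the replication point: for anyone curious what a hybrid like this can look like, here is a generic sketch of training image features and traditional risk factors jointly, end to end. It is my own toy code, not what Appendix E specifies; the layer sizes and the 3-channel input are arbitrary.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class HybridRiskModel(nn.Module):
    def __init__(self, n_risk_factors=8):
        super().__init__()
        self.image_encoder = resnet18(weights=None)
        self.image_encoder.fc = nn.Identity()                # expose 512-d image features
        self.classifier = nn.Sequential(
            nn.Linear(512 + n_risk_factors, 128), nn.ReLU(),
            nn.Linear(128, 2),                               # 5-year cancer: yes/no
        )

    def forward(self, image, risk_factors):
        feats = self.image_encoder(image)                    # (batch, 512)
        return self.classifier(torch.cat([feats, risk_factors], dim=1))

model = HybridRiskModel()
logits = model(torch.randn(2, 3, 224, 224), torch.randn(2, 8))
loss = nn.CrossEntropyLoss()(logits, torch.tensor([0, 1]))
loss.backward()                                              # both branches get gradients
```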
ResNet18 is ancient and tiny…I don’t understand why they didn’t go with a deeper network. ResNet50 is usually the smallest I’ll use.
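For a sense of scale between the two (stock torchvision models):

```python
from torchvision.models import resnet18, resnet50

for build in (resnet18, resnet50):
    params = sum(p.numel() for p in build(weights=None).parameters())
    print(build.__name__, f"{params / 1e6:.1f}M parameters")  # roughly 11.7M vs 25.6M
```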
That's why I hate the term AI. Say it is a predictive LLM or a pattern recognition model.
According to the paper cited by the article OP posted, there is no LLM in the model. If I read it correctly, the paper says that it uses PyTorch's implementation of ResNet18, a deep convolutional neural network that isn't specifically designed to work on text. So this term would be inaccurate.
Much better term IMO, especially since it uses a convolutional network. But since the article is a news publication, not a serious academic paper, the author knows the term "AI" gets clicks and positive impressions (which is what their job actually is) and we wouldn't be here talking about it.
That performance curve seems terrible for any practical use.
Yeah that's an unacceptably low ROC curve for a medical usecase
Good catch!
Well, this is very much an application of AI... Having more examples of recent AI development that aren't 'chatgpt'(/transformers-based) is probably a good thing.
OP is not saying this isn't using the techniques associated with the term AI. They're saying that the term AI is misleading, broad, and generally not desirable in a technical publication.
The correct term is "Computational Statistics"
Stop calling it that, you're scaring the venture capital
It's a good term, it refers to lots of things. There are many terms like that.
So it's a bad term.
the word program refers to even more things and no one says it's a bad word.
It's literally the name of the field of study. Chances are this uses the same thing as LLMs, aka neural networks, which are some of the oldest AIs around.
It refers to anything that simulates intelligence. They are using the correct word. People just misunderstand it.
The problem is that it refers to so many and constantly changing things that it doesn't refer to anything specific in the end. You can replace the word "AI" in any sentence with the word "magic" and it basically says the same thing...
It's really difficult to clean that data. In another case, they kept the markings on the training data, and the result was that the images from patients who had cancer had a doctor's signature on them, so the AI could always tell the cancer images from the non-cancer ones just by the presence or absence of a signature. However, these people are also getting smarter about picking their training data, so it's not impossible for this to work properly at some point.
Citation please?
That's the "nice" thing about machine learning: it sees nothing but whatever correlates. That's why data science is such a complex topic, as you do not spot errors this easily. Testing a model is still very underrated, and usually there is no time to test a model properly.
Why do I still have to work my boring job while AI gets to create art and look at boobs?
Because life is suffering and machines dream of electric sheep.
I’ve seen things you people wouldn’t believe.
I dream of boobs.
Now make mammograms not $500 and not have a 6 month waiting time and make them available for women under 40. Then this'll be a useful breakthrough
It's already this way in most of the world.
Oh for sure. I only meant in the US where MIT is located. But it's already a useful breakthrough for everyone in civilized countries
For reference here in Australia my wife has been asking to get mammograms for years now (in her 30s) and she keeps getting told she’s too young because she doesn’t have a familial history. That issue is a bit pervasive in countries other than the US.
I think it’s free in most of Europe, or relatively cheap
Better yet, give us something better to do about the cancer than slash, burn, poison. Something that's less traumatic on the rest of the person, especially in light of the possibility of false positives.
Also, flying cars and the quadrature of the circle.
Done.
If it has just as low of a false negative rate as human-read mammograms, I see no issue. Feed it through the AI first before having a human check the positive results only. Save doctors' time when the scan is so clean that even the AI doesn't see anything fishy.
Alternatively, if it has a lower false positive rate, have doctors check the negative results only. If the AI sees something then it's DEFINITELY worth a biopsy. Then have a human doctor check the negative readings just to make sure they don't let anything that's worth looking into go unnoticed.
Either way, as long as it isn't worse than humans in both kinds of failures, it's useful at saving medical resources.
an image recognition model like this is usually tuned specifically to have a very low false negative (well below human, often) in exchange for a high false positive rate (overly cautious about cancer)!
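That tuning usually comes down to where you put the decision threshold. A generic sketch with toy numbers: pick the threshold from the ROC curve so sensitivity hits a target, and accept whatever false positive rate that costs.

```python
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])                 # toy labels
y_score = np.array([.1, .2, .3, .35, .6, .7, .4, .65, .8, .9])    # toy model scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)
idx = np.argmax(tpr >= 0.99)                                       # first threshold reaching ~99% sensitivity
print(f"threshold={thresholds[idx]}, sensitivity={tpr[idx]:.2f}, false positive rate={fpr[idx]:.2f}")
```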
This is exactly what is being done. My eldest child is in a Ph.D. program for human-robot interaction and medical intervention, and has worked on image analysis systems in this field. Their intended use is exactly that - a "first look" and "second look". A first look to help catch the small, easily overlooked pre-tumors, and tentatively mark clear ones. A second look to be a safety net for tired, overworked, or outdated eyes.
You in QA?
HAHAHAHA thank fuck I am not
Nice comment. I like the detail.
For me, the main takeaway doesn't have anything to do with the details though, it's about the true usefulness of AI. The details of the implementation aren't important, the general use case is the main point.
I can do that too, but my rate of success is very low
Ok, I'll concede. Finally a good use for AI. Fuck cancer.
It's got a decent chunk of good uses. It's just that none of those are going to make anyone a huge ton of money, so they don't have a hype cycle attached. I can't wait until the grifters get out and the hype cycle falls away, so we can actually get back to using it for what it's good at and not shoving it indiscriminately into everything.
The hypesters and grifters do not prevent AI from being used for truly valuable things even now. In fact medical uses will be one of those things that WILL keep AI from just fading away.
Just look at those marketing wankers as a cherry on the top that you didn't want or need.
I mean, yeah, except that the unnecessary applications are all the corporations are paying anyone to do these days. When the hype flies around like this, the C-suite starts trying to micromanage the product team's roadmap. Once it dies down, they let us get back to work.
People just need to understand that the true medical uses are as *tools* for physicians, not "replacements" for physicians.
I think the vast majority of people understand that already. They don't understand just what all those gadgets are for anyway. Medicine is largely a "black box" or magical process anyway.
There are way too many techbros trying to push the idea of turning ChatGPT into a physician replacement. After it "passed" the board exams, they immediately started hollering about how physicians are outdated and too expensive and we can just replace them with AI. What that ignores is that the board exam is multiple choice, and a massive portion of medical student evaluation is on the "art" side of medicine: taking the history and performing the physical exam, the very things the question stem hands you for free in multiple-choice questions.
Also, for GPU prices to come down. Right now the AI garbage is eating a lot of the GPU production, as well as wasting a ton of energy. It sucks. Right as the crypto stuff started dying out we got AI crap.
Yeah, fuck that detecting cancer crap, I want to game!
GPU price hikes are causing problems outside of the gaming industry, too. Imaging, scientific research, astronomy...
Might be, but I somehow don't picture an astronomer complaining about GPU prices on lemmy...
You missed that we were talking about the useless AI garbage, didn't you? I guess humans can also put out garbage...
What article is this comment section about?
A cure for cancer, if it can be literally nipped in the bud, seems like a possible money-maker to me.
It's a money saver, so its profit model is all wonky.
A hospital, as a business, will make more money treating cancer than it will doing a mammogram and having a computer identify issues for preventative treatment.
A hospital, as a place that helps people, will still want to use these scans widely because "ignoring preventative care to profit off long term treatment" is a bit too "mask off" even for the US healthcare system and doctors would quit.
Insurance companies, however, would pay just shy of the cost of treatment to avoid paying for treatment.
So the cost will rise to be the cost of treatment times the incidence rate, scaled to the likelihood the scan catches something, plus system costs and staff costs.
In a sane system, we'd pass a law saying capable facilities must provide preventative screenings at cost where there's a reasonable chance the scan would provide meaningful information and have the government pay the bill. Everyone's happy except people who view healthcare as an investment opportunity.
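The pricing logic a couple of paragraphs up, with purely hypothetical numbers just to show its shape:

```python
treatment_cost = 100_000   # hypothetical average cost of treating one case
incidence = 0.02           # hypothetical share of screened patients who have cancer
catch_rate = 0.9           # hypothetical chance the scan catches it
overhead = 150             # hypothetical per-scan system + staff cost

price_ceiling = treatment_cost * incidence * catch_rate + overhead
print(price_ceiling)       # ~1950: the most an insurer would rationally pay per scan
```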
I believe this idea was generally debunked a little while ago; to wit, the profit margin on cancer care just isn't as big (you have to pay a lot of doctors) as the profit margin on mammograms. Moreover, you're less likely to actually get paid the later you identify it (because end-of-life care costs for the deceased tend to get settled rather than being paid).
I'll come back and drop the article link here, if I can find it.
Oh interesting, I'd be happy to be wrong on that. :)
I figured they'd factor the staffing costs into what they charge the insurance, so it'd be more profit due to higher fixed costs, longer treatment, and some fixed percentage profit margin.
The estate costs thing is unfortunately an avenue I hadn't considered. :/
I still think it would be better if we removed the profit incentive entirely, but I'm pleased if the two interests are aligned if we have to have both.
That's not what this is, though. This is early detection, which is awesome and super helpful, but way less game-changing than an actual cure.
It's not a cure in itself, but isn't early detection a good way to catch it early and in many cases kill it before it spreads?
It sure is. But this is basically just making something that already exists more reliable, not creating something new. Still important, but not as earth-shaking.
Those are going to make a ton of money for a lot of people. Every 1% fuel efficiency gained, every second saved in an industrial process, it's hundreds of millions of dollars.
You don't need AI in your fridge or in your snickers, that will (hopefully) die off, but AI is not going away where it matters.
Well, AI has been in those places for a while. The hype cycle is around generative AI which just isn't useful for that type of thing.
I'm sure if Nvidia, AMD, Apple and co. create NPUs or TPUs for gen AI, those can also be used in those places, improving them along the way.
Why do you think that?
Nothing I've seen with current generative AI techniques leads me to believe that it has any particular utility for system design or architecture.
There *are* AI techniques that can help with such things, they're just not the generative variety.
Right, but not any *one* person. The people running the hype train want to be that one person, but the real uses just aren't going to be something you can exclusively monetize.
Depends how you define "a ton" of money. Plenty of startups have been acquired for silly amounts of money, plenty of consultants are making bank, many executives are cashing big bonuses for successful improvements using AI...
I define "a ton" of money in this case to mean "the amount they think of when they get the dollar signs in their eyes." People are cashing in on that delusion right now, but it's not going to last.
Honestly they should go back to calling useful applications ML (that is what it is) since AI is getting such a bad rap.
machine learning is a type of AI. scifi movies just misused the term and now the startups are riding the hype trains. AGI =/= AI. there's lots of stuff to complain about with ai these days like stable diffusion image generation and LLMs, but the fact that they are AI is simply true.
I mean, it's entirely an arbitrary distinction. AI, for a very long time before ChatGPT, meant something like AGI. We didn't call classification models "intelligent" because they didn't have any human-like characteristics. It's as silly as saying a regression model is AI. They aren't intelligent things.
I once had ideas about building a machine learning program to assist workflows in Emergency Departments, and its training data would be entirely generated by the specific ER it's deployed in. Because of differences in populations, the data is not always readily transferable between departments.
AI should be used for this, yes, however advertisement is more profitable.
It's worse than that.
This is a different type of AI that doesn't have as many consumer facing qualities.
The ones that are being pushed now are the first types of AI to have an actually discernable consumer facing attribute or behavior, and so they're being pushed because no one wants to miss the boat.
They're not more profitable or better or actually doing anything anyone wants for the most part, they're just being used where they can fit it in.
This type of segmentation is of declining practical value. Modern AI implementations are usually hybrids of several categories of constructed intelligence.
Unfortunately AI models like this one often never make it to the clinic. The model could be impressive enough to identify 100% of cases that will develop breast cancer. However, if it has a false positive rate of say 5%, its use may actually create more harm than it intends to prevent.
That's why these systems should never be used as the sole decision makers, but instead work as a tool to help the professionals make better decisions.
Keep the human in the loop!
Another big thing to note: we recently had a different but VERY similar headline about an AI that found typhoid early and was able to point it out more accurately than doctors could.
But when they examined the AI to see what it was doing, it turned out that it was weighing the specs of the machine being used to do the scan... An older machine means the area was likely poorer and therefore more likely to have typhoid. The AI wasn't pointing out whether someone had typhoid; it was just telling you whether they were in a rich area or not.
That's actually really smart. But that info wasn't given to doctors examining the scan, so it's not a fair comparison. It's a valid diagnostic technique to focus on the particular problems in the local area.
"When you hear hoofbeats, think horses not zebras" (outside of Africa)
AI is weird. It may not have been given the information explicitly. Instead it could be an artifact in the scan itself due to the different equipment. Like if one scan was lower resolution than the others, but you resized all of the scans to be the same size as the lowest one, the AI might be picking up on the resizing artifacts, which are not present in the lower-resolution one.
The manufacturing date of the scanner was actually saved as embedded metadata in the scan files themselves. None of the researchers considered that to be a thing until after the experiment, when they found that it was THE thing the model was looking at.
I'm saying that info is readily available to doctors in real life. They are literally in the hospital and know what the socioeconomic background of the patient is. In real life they would be able to guess the same.
That is quite a statement that it still had a better detection rate than doctors.
What is more important, saving lives or not offending people?
The thing is tho... It has a better detection rate ON THE SAMPLES THEY HAD but because it wasn't actually detecting anything other than wealth there was no way for them to trust it would stay accurate.
Citation needed.
Usually detection rates are given on a new set of samples; on the samples they used for training, the detection rate would be 100% by definition.
Right, there's typically separate "training" and "validation" sets for a model to train, validate, and iterate on, and then a totally separate "test" dataset that measures how effective the model is on similar data that it wasn't trained on.
If the model gets good results on the validation dataset but less good on the test dataset, that typically means that it's "overfit". Essentially the model started memorizing frivolous details specific to the validation set that improve evaluation results on that specific dataset but do nothing, or even hurt the results, for the testing and other datasets that weren't a part of training. Basically, the model failed to abstract what it's supposed to detect, only managing good results in validation through brute memorization.
I'm not sure if that's quite what's happening in maven's description though. If it's real, my initial thoughts are an unrepresentative dataset + failing to reach high accuracy to begin with. I buy that there's a correlation between machine specs and positive cases, but I'm sure it's not a perfect correlation. Like maven said, old areas get new machines sometimes. If the model's accuracy was never high to begin with, that correlation may just be the model's best guess. Even though I'm sure that it would always take machine specs into account as long as they're part of the dataset, if actual symptoms correlate more strongly to positive diagnoses than machine specs do, then I'd expect the model to evaluate primarily on symptoms, and thus be more accurate. Sorry this got longer than I wanted.
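For anyone unfamiliar with that split, it is usually just two held-out carve-outs (generic sketch with stand-in data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 20)                      # stand-in features
y = np.random.randint(0, 2, size=1000)            # stand-in labels

# Hold out a test set first; it is never touched during training or tuning.
X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# Carve a validation set out of the remainder for tuning and iteration.
X_train, X_val, y_train, y_val = train_test_split(X_trainval, y_trainval, test_size=0.25, random_state=0)
# A model that looks great on X_val but much worse on X_test is showing the
# overfitting pattern described above.
```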
What if one of those lower economic areas decides that the machine is too old and they need to replace it with a brand new one? Now every single case is a false negative because of how highly that was rated in the system.
The data they had collected followed that trend but there is no way to think that it'll last forever or remain consistent because it isn't about the person it's just about class.
Breast imaging already relies on a high false positive rate. False positives are way better than false negatives in this case.
That’s just not generally true. Mammograms are usually only recommended to women over 40. That’s because the rates of breast cancer in women under 40 are low enough that testing them would cause more harm than good thanks in part to the problem of false positives.
Nearly 4 out of 5 that progress to biopsy are benign. Nearly 4 times that are called for additional evaluation. The false positives are quite high compared to other imaging. It is designed that way, to decrease the chances of a false negative.
The false negative rate is also quite high. It will miss about 1 in 5 women with cancer. The reality is mammography is just not all that powerful as a screening tool. That’s why the criteria for who gets screened and how often has been tailored to try and ensure the benefits outweigh the risks. Although it is an ongoing debate in the medical community to determine just exactly what those criteria should be.
Not at all, in this case.
A false positive rate of even 50% can mean telling a patient they are at higher risk of developing breast cancer and should get screened every 6 months instead of every year for the next 5 years.
Keep in mind that women have about a 12% chance of getting breast cancer at some point in their lives. During the highest-risk years it's about a 2 percent chance per year, so a machine with a 50% false positive rate on a 5-year prediction would still only be telling something like 15% of women to be screened more often.
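Rough back-of-envelope for that, assuming ~2% risk per year and reading "50% false positive" as half of all positive calls turning out to be false alarms (both simplifications on my part):

```python
# Back-of-envelope only; the 2%/year figure and the PPV reading are my assumptions.
annual_risk = 0.02
five_year_risk = 1 - (1 - annual_risk) ** 5   # ~9.6% of women develop cancer in the window
flagged = five_year_risk / 0.5                # if half of positive calls are false, ~19% get flagged
print(f"5-year risk ~ {five_year_risk:.1%}, flagged ~ {flagged:.1%}")
```

So you land somewhere in the 15-20% range, i.e. most women would still just stay on the normal screening schedule.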
How would a false positive create more harm? Isn't it better to cast a wide net and detect more possible cases? Then false negatives are the ones that worry me the most.
It’s a common problem in diagnostics and it’s why mammograms aren’t recommended to women under 40.
Let’s say you have 10,000 patients. 10 have cancer or a precancerous lesion. Your test may be able to identify all 10 of those patients. However, if it has a false positive rate of 5% that’s around 500 patients who will now get biopsies and potentially surgery that they don’t actually need. Those follow up procedures carry their own risks and harms for those 500 patients. In total, that harm may outweigh the benefit of an earlier diagnosis in those 10 patients who have cancer.
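Running those exact hypothetical numbers (10 real cases in 10,000 people, 5% false positive rate):

```python
# Hypothetical numbers from the comment above, not real screening statistics.
population, true_cases, fp_rate = 10_000, 10, 0.05

false_positives = (population - true_cases) * fp_rate     # ~500 healthy people flagged
ppv = true_cases / (true_cases + false_positives)          # chance a flagged patient really has cancer
print(round(false_positives), f"{ppv:.1%}")                 # ~500 flagged, ~2% of positives are real
```

Even with a test that catches every real case, only about 2% of the positive calls would be real cancers, which is exactly the base-rate problem being described.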
Well it'd certainly benefit the medical industry. They'd be saddling tons of patients with surgeries, chemotherapy, mastectomy, and other treatments, "because doctor-GPT said so."
But imagine being a patient getting physically and emotionally altered, plunged into irrecoverable debt, distressing your family, and it all being a whoopsy by some black-box software.
That's a good point, that it could burden the system, but why would you ever put someone on chemotherapy based on the model described in the paper? It seems more like it could burden the system by increasing the number of patients doing more frequent screening. Someone has to pay for all those doctor-patient meeting hours, for sure. But the benefit outweighs this cost (which in my opinion is good and cheap, since it prevents future treatment at later stages, which is expensive).
Biopsies are small but still invasive. There's risk of infection or reactions to anesthesia in any surgery. If 100 million women get this test, a 5% false positive rate will mean 5 million unnecessary interventions. Not to mention the stress of being told you have cancer.
5 million unnecessary interventions means a small percentage of those people (thousands) will die or be harmed by the treatment. That's the harm that it causes.
You have a really good point too! Maybe just an indication of higher risk, and just saying "Hey, screening more often couldn't hurt," might actually be a net positive, and it wouldn't warrant such extreme measures unless it was positively confirmed by, hopefully, human professionals.
You're right though, there always seems to be more demand than supply for anything medicine related. Not to mention, here in the U.S. for example, needless extra screenings could also hit a lot of people hard financially.
There's a lot to be considered here.
Serious question: is there a way to get access to medical imagery as a non-student? I would love to do some machine learning with it myself, as I see lots of potential in image analysis in general. 5 years ago I created a model that could spot certain types of ships using only satellite imagery, ships that were not easily detectable by eye, and that's ignoring the fact that one human cannot scan 15k images in one hour. Similar use case with medical imagery: seeing the things that are not yet detectable by human eyes.
Yeah, there are some openly available datasets on competition sites like Kaggle, and some medical data is available through public institutions like the NIH.
I knew about kaggle, but not about NIH. Thanks for the hint!
Yeah there is. A bloke I know did exactly that with brain scans for his masters.
Would you mind asking your friend, so you can provide the source?
https://adni.loni.usc.edu/ here ya go
Edit: European DTI Study on Dementia too, he said it's easier to get data from there
Lovely, thank you very much, kind stranger!
This is similar to what I did for my masters, except it was lung cancer.
Stuff like this is actually relatively easy to do, but the regulations you need to conform to and the testing you have to do first are extremely stringent. We had something that worked for like 95% of cases within a couple of months, but it wasn't until almost 2 years later that they got to do their first actual trial.
I had a housemate a couple of years ago who had a side job where she'd look through a load of these and confirm which were accurate. She didn't say it was AI though.
For a little while ours was used for this. Covid too. Client was under an alias and wasn't with us long so no idea.
Btw, my dentist used AI to identify potential problems in a radiograph. The result was pretty impressive. Have to get a filling tho.
This is a great use of tech. With that said I find that the lines are blurred between "AI" and Machine Learning.
Real Question: Other than the specific tuning of the recognition model, how is this really different from something like Facebook automatically tagging images of you and your friends? Instead of saying "Here's a picture of Billy (maybe) " it's saying, "Here's a picture of some precancerous masses (maybe)".
That tech has been around for a while (at least 15 years). I remember Picasa doing something similar as a desktop program on Windows.
I've been looking at the paper; some things about it:
- the paper and article are from 2021
- the model needs to be able to use optional data (age, family history, etc.) but not be reliant on it
- it needs to combine information from multiple views
- it predicts risk for each year in the next 5 years
- it has to produce consistent results with different sensors and diverse patients
- it's not the first model to do this, and it is more accurate than previous methods

(rough sketch of what that kind of setup might look like is below)
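Not the authors' code, just a rough PyTorch sketch of how an image backbone plus optional clinical features could be wired up to output per-year risk over 5 years. Everything here (layer sizes, the way views get pooled, the clinical feature count) is made up for illustration:

```python
# Rough sketch only -- NOT the actual architecture from the paper. Shapes and names are illustrative.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class RiskModel(nn.Module):
    def __init__(self, n_clinical: int = 8, n_years: int = 5):
        super().__init__()
        backbone = resnet18(weights=None)            # image feature extractor
        backbone.fc = nn.Identity()                  # keep the 512-d features
        self.backbone = backbone
        self.clinical = nn.Linear(n_clinical, 32)    # optional tabular risk factors
        self.head = nn.Linear(512 + 32, n_years)     # one risk output per future year

    def forward(self, views, clinical=None):
        # views: (batch, n_views, 3, H, W) -- pool features across mammogram views
        b, v, c, h, w = views.shape
        feats = self.backbone(views.reshape(b * v, c, h, w)).reshape(b, v, -1).mean(dim=1)
        if clinical is None:                         # clinical data is optional
            clinical = torch.zeros(b, self.clinical.in_features)
        x = torch.cat([feats, torch.relu(self.clinical(clinical))], dim=1)
        return torch.sigmoid(self.head(x))           # per-year risk estimates

model = RiskModel()
risk = model(torch.randn(2, 4, 3, 224, 224))         # 2 patients, 4 views each
print(risk.shape)                                     # torch.Size([2, 5])
```

The real model handles missing risk factors and the multiple views much more carefully; this is only meant to show the general shape of that kind of setup.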
Good stuff
Kinda mean of you calling Billy precancerous masses like that smh
I don't care about mean but I would call it inaccurate. Billy is already cancerous, He's mostly cancer. He's a very dense, sour boy.
Everything machine learning will be called "ai" from now until forever.
It's like how all rc helicopters and planes are now "drones"
People en masse just can't handle the nuance of language. They need a dumb word for everything that is remotely similar.
It's because *AI* is the new buzzword that has replaced "machine learning" and "large language models", it sounds a lot more sexy and futuristic.
Besides LLMs, large language models, we also have GANs, Generative Adversarial Networks.
https://en.wikipedia.org/wiki/Large_language_model
https://en.wikipedia.org/wiki/Generative_adversarial_network
Where is the meme?
Well, in Turkish, "meme" means boob/breast.
The ai we got is the meme
pretty sure iterate is the wrong word choice there
They probably meant reiterate
I think it's a joke, like to imply they want to not just reiterate, but rerererereiterate this information, both because it's good news and also in light of all the sucky ways AI is being used instead. Like at first they typed "I just want to reiterate..." but decided that wasn't nearly enough.
That's not the only issue with the English-esque writing.
100% true, just the first thing that stuck out at me
I suppose they just dropped the "re" off of "reiterate" since they're saying it for the first time.
Dude needs to use AI to fix his fucking grammar.
https://news.mit.edu/2024/ai-model-identifies-certain-breast-tumor-stages-0722
How soon could this diagnostic tool be rolled out? It sounds very promising given the seriousness of the DCIS!
Anything medical is slow, but tools like this tend to get used by doctors, not patients, so it's much easier.
As soon as your hospital system is willing to pay big money for it.
And if we weren't a big, broken mess of late stage capitalist hellscape, you or someone you know could have actually benefited from this.
I'm involved in multiple projects where stuff like this will be used in very accessible manners, hopefully in 2-3 years, so don't get too pessimistic.
Yea none of us are going to see the benefits. Tired of seeing articles of scientific advancement that I know will never trickle down to us peasants.
Our clinics are already using ai to clean up MRI images for easier and higher quality reads. We use ai on our cath lab table to provide a less noisy image at a much lower rad dose.
It never makes mistakes that affect diagnosis?
It’s not diagnosing, which is good imho. It’s just being used to remove noise and artifacts from the images on the scan. This means the MRI is clearer for the reading physician and ordering surgeon in the case of the MRI and that the cardiologist can use less radiation during the procedure yet get the same quality image in the lab.
I’m still wary of using it to diagnose in basically any scenario, because of how serious the consequences of both false negatives and false positives can be.
... they said, typing on a tiny silicon rectangle with access to the whole of humanity's knowledge and that fits in their pocket...
The most beneficial application of AI like this is to reverse-engineer the neural network to figure out how the AI works. In this way we may discover a new technique or procedure, or we might find out the AI's methods are bullshit. Under no circumstance should we accept a "black box" explanation.
y = w^T x
hope this helps!
good luck reverse-engineering millions if not billions of seemingly random floating point numbers. It's like visualizing a graph in your mind by reading an array of numbers, except in this case the graph has as many dimensions as the neural network has inputs, which is the number of pixels the input image has.
Go learn at least the basic principles of neural networks, because that sentence of yours alone makes me want to slap you.
Don't worry, researchers will just get an AI to interpret all those floating point numbers and come up with a human-readable explanation! What could go wrong? /s
Hey look, this took me like 5 minutes to find.
Censius guide to AI interpretability tools
Here's a good thing to wonder: if you don't know how your black box model works, how do you know it isn't racist?
Here's what looks like a university paper on interpretability tools:
Oh yeah. I forgot about that. I hope your model is understandable enough that it doesn't get you in trouble with the EU.
Oh look, here you can actually see one particular interpretability tool being used to interpret one particular model. Funny that, people actually caring what their models are using to make decisions.
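For a concrete example of the kind of thing those tools do, the simplest trick is a gradient saliency map: ask which input pixels the output is most sensitive to. A minimal sketch (untrained stand-in model, random image, not any particular tool from the links above):

```python
# Minimal gradient-saliency sketch: which pixels most influence the model's output?
import torch
from torchvision.models import resnet18

model = resnet18(weights=None).eval()                  # stand-in for any trained image classifier
image = torch.randn(1, 3, 224, 224, requires_grad=True)

score = model(image)[0].max()                          # the score we want explained
score.backward()                                       # gradient of that score w.r.t. every pixel

saliency = image.grad.abs().max(dim=1).values          # (1, 224, 224) heatmap of pixel influence
print(saliency.shape)
```

It's nowhere near a full explanation of the model, but it's a long way from staring at raw floating point numbers.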
Look, maybe you were having a bad day, or maybe slapping people is literally your favorite thing to do, who am I to take away mankind's finer pleasures, but this attitude of yours is profoundly stupid. It's weak. You don't want to know? It doesn't make you curious? Why are you comfortable not knowing things? That's not how science is propelled forward.
interpretability costs money though :v
"Enough" is doing a fucking ton of heavy lifting there. You cannot explain a terabyte of floating point numbers. Same way you cannot *guarantee* a specific doctor or MRI technician isn't racist.
A single drop of water contains billions of molecules, and yet, we can explain a river. Maybe you should try applying yourself. The field of hydrology awaits you.
No, we cannot explain a river, or the atmosphere. Hence weather forecasts are only good for a few days, and even after massive computer simulations, aircraft/cars/ships still need wind tunnel testing and real-life testing, because we can only approximate the real thing in our models.
iirc it recently turned out that the whole black box thing was actually a bullshit excuse to evade liability, at least for certain kinds of model.
Link?
This ones from 2019 Link
I was a bit off the mark: it's not that the models they use aren't black boxes, it's just that they could have made them interpretable from the beginning and chose not to, likely due to liability.
Well, in theory you can explain how the model comes to its conclusion. However, I'd guess that only 0.1% of the "AI engineers" are actually capable of that. And those probably cost 100k per month.
It depends on the algorithms used. The lazy approach is to just throw neural networks at everything and waste immense computational resources. Of course, you then get results that are difficult to interpret. There are much more efficient algorithms that work well for many problems and give you interpretable decisions.
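To make that concrete, something like a small decision tree sits at the fully interpretable end: you can literally print the rules it learned. This uses the classic scikit-learn breast cancer dataset, which is tabular features, not imaging:

```python
# A small decision tree: the learned rules can be printed and audited by a human.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(data.data, data.target)

# Prints nested rules like "|--- worst radius <= 16.80" with the predicted class at each leaf
print(export_text(tree, feature_names=list(data.feature_names)))
```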
our brain is a black box, we accept that. (and control the outcomes with procedures, checklists, etc)
It feels like lots of professionals can't exactly explain every single aspect of how they do what they do; sometimes it just feels right.
What a vague and unprovable thing you've stated there.
IMO, the "black box" thing is basically ML developers hand-waving and saying "it's magic" because they know it would take way too long to explain all the underlying concepts in order to even start to explain how it works.
I have a very crude understanding of the technology. I'm not a developer, I work in IT support. I have several friends that I've spoken to about it, some of whom have made fairly rudimentary machine learning algorithms and neural nets. They understand it, and they've explained a few of the concepts to me, and I'd be lying if I said that none of it went over my head. I've done programming and development, I'm senior in my role, and I have a lifetime of technology experience and education... And it goes over my head. What hope does anyone else have? If you're not a developer or someone ML-focused, yeah, it's basically magic.
I won't try to explain. I couldn't possibly recall enough about what has been said to me, to correctly explain anything at this point.
The AI developers understand how AI works, but that does not mean that they understand the thing that the AI is trained to detect.
For instance, the cutting edge in protein folding (at least as of a few years ago) is Google's AlphaFold. I'm sure the AI researchers behind AlphaFold understand AI and how it works. And I am sure that they have an above-average understanding of molecular biology. However, they do not understand protein folding better than the physicists and chemists who have spent their lives studying the field. The core of their understanding is "the answer is somewhere in this dataset; all we need to do is figure out how to throw ungodly amounts of compute at it, and we can make predictions". Working out how to productively throw that much compute power at a problem is not easy either, and *that* is what ML researchers understand and are experts in.
In the same way, the researchers here understand how to go from a large dataset of breast images to cancer predictions, but that does not mean they have any understanding of cancer. And certainly not a better understanding than the researchers who have spent their lives studying it.
An open problem in ML research is how to take the billions of parameters that define an ML model and extract useful information that can provide insights to help human experts understand the system (both in general, and in understanding the reasoning for a specific classification). Progress has been made here as well, but it is still a long way from being solved.
Thank you for giving some insight into ML, which is now often just branded "AI". Just one note, though: there are many ML algorithms that do not employ neural networks and don't have billions of parameters. Especially in binary-choice image recognition (looks like cancer or not), things like support vector machines achieve great results, and they have very few parameters.
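Case in point, a linear SVM on the classic scikit-learn breast cancer dataset (30 tabular features per sample, extracted from digitized images) comes down to 30 weights plus a bias and still classifies well. A sketch:

```python
# A linear SVM has one weight per input feature -- a few dozen parameters, not billions.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = load_breast_cancer(return_X_y=True)              # 30 numeric features per sample
clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0, max_iter=5000))

print(cross_val_score(clf, X, y, cv=5).mean())           # typically around 0.97 accuracy
clf.fit(X, y)
print(clf[-1].coef_.shape, clf[-1].intercept_.shape)     # (1, 30) weights and (1,) bias
```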
Machine learning is a subset of Artificial intelligence, which is a field of research as old as computer science itself
https://en.m.wikipedia.org/wiki/Artificial_intelligence
Yes, this is "how it was supposed to be used for".
The quality of sentence construction these days is in freefall.
Not everyone's a native speaker.
*shrugs* you know people have been confidently making these kinds of statements... since written language was invented? I bet the first person who developed written language did it to complain about how this generation of kids don't know how to write a proper sentence.
What is in freefall is the economy for the middle and working class and basic idea that artists and writers should be compensated, period. What has released us into freefall is that making art and crafting words are shit on by society as not a respectable job worth being paid a living wage for.
There are a terrifying amount of good writers out there, more than there have ever been, both in total number AND per capita.
This isn't a creative writing project. This isn't an artist presenting their work. What in the world did that tangent even come from?
This is just plain speech, written objectively incorrectly.
But go on, I'm sure next I'll be accused of all the problems of the writing industry or something.
Ironically, if they'd used an LLM, it would have corrected their writing.
Lmao
Sure, I definitely overreacted and I honestly was pretty stressed out the day I replied so yeah, fair. I think I have a point, this just wasn't the salient place for it and I was too tired to realize that in the moment.
Objectively incorrect according to, who exactly?
Bro, it's Twitter
And that excuses it I guess.
That would be correct, yes.
Twitter: Where wrongness gathers and imagines itself to be right.
The AI genie is out of the bottle and — as much as we complain — it isn't going away; we need thoughtful legislation. AI is going to take my job? Fine, I guess? That sounds good, really. Can I have a guaranteed income to live on, because I still need to live? Can we tax the rich?
Can't pigeons do the same thing?
Nooooooo you're supposed to use AI for good things and not to use it to generate meme images.
I think you mean mammary images?
Not my proudest fap...
That's a challenging wank
Man I miss him.
Honestly, with all respect, that is a really shitty joke. It's god damn breast cancer, the opposite of hot.
I usually just skip these mouldy jokes, but like, c'mon, that is beyond the scale of cringe.
Terrible things happen to people you love, you have two choices in this life. You can laugh about it or you can cry about it. You can do one and then the other if you choose. I prefer to laugh about most things and hope others will do the same. Cheers.
I mean, do whatever you want, but it just comes off as repulsive. Like a stain of shit on new shoes.
This is a public space after all, not the bois' locker room, so that might be embarrassing for you.
And you know you can always count on me to point stuff out so you can avoid humiliation in the future
Thanks for your excessively unnecessary put down. Don't worry though. No matter how hard you try, you won't be able to stop me from enjoying my life and bringing joy to others. Why are you obsessed with shit btw?
Sorry for that comment, I was having a shitty time back then and shouldn't have been so aggressive to you, PlantDad.
https://youtube.com/shorts/xIMlJUwB1m8?si=zH6eF5xZ5Xoz_zsz
Detecting is not enough to be useful.
The test is 90% accurate, and that's still pretty useful. Especially if you are simply putting people into a high-risk group that needs to be more closely monitored.
“90% accurate” is a non-statement. It’s like you haven’t even watched the video you respond to. Also, where the hell did you pull that number from?
How specific is it and how sensitive is it is what matters. And if Mirai in https://www.science.org/doi/10.1126/scitranslmed.aba4373 is the same model that the tweet mentions, then neither its specificity nor sensitivity reach 90%. And considering that the image in the tweet is trackable to a publication in the same year (https://news.mit.edu/2021/robust-artificial-intelligence-tools-predict-future-cancer-0128), I’m fairly sure that it’s the same Mirai.
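For anyone unfamiliar with the terms, sensitivity and specificity fall straight out of the confusion matrix; a toy example with made-up numbers:

```python
# Sensitivity and specificity from a confusion matrix -- the numbers here are made up.
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)   # of the people who have cancer, how many the test catches
specificity = tn / (tn + fp)   # of the people who don't, how many it correctly clears
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")  # 0.75 and 0.67
```

A "90% accurate" claim on its own says nothing about the split between those two, which is the point being made here.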
Well, you can just do the math yourself, it's pretty straight-forward.
However, more to the point, it's taken right from around 38 seconds into the video. Kind of funny to be accused of "not watching the video" by someone who is implying the number was pulled from nowhere, when it's right in the video.
I certainly don't think this closes the book on anything, but I'm responding to your claim that it's not useful. If this is a cheap and easy test, it's a great screening tool putting people into groups of low risk/high risk for which further, maybe more expensive/specific/sensitive, tests can be done. Especially if it can do this early.
Wanna bet it’s not “AI” ?
Learning machines are AI as well; it's not really what we picture when we think of AI, but it is nonetheless.
It's probably more "AI" than the LLMs we've been plagued with. This sounds more like an application of machine learning, which is a hell of a lot more promising.
AI and machine learning are very similar (if not identical) things, just one has been turned into a marketing hype word a whole lot more than the other.
Machine learning is one of the many things that is referred to by "AI", yes.
My thought is that the term "AI" has been overused to the point of uselessness, from the nested if statements that decide how video game enemies move, to various kinds of machine learning, to large language models.
So I'm personally going to avoid the term.
AI == Computer Thingy that looks kinda "smart" to people that don't understand it. it's like rectangles and squares. you should use the more precise word (CNN, LLM, Stable diffusion) when applicable, just like with rectangles and squares
This seems exactly like what I would have referred to as AI before the pandemic. Specifically Deep Learning image processing. In terms of something you can buy off the shelf this is theoretically something the Cognex Vidi Red Tool could be used for. My experience with it is in packaging, but the base concept is the same.
Training a model requires loading images into the software and having a human mark them before having a very powerful CUDA GPU process all of that. Once the model has been trained it can usually be run on a fairly modest PC in comparison.
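That split between heavy training and lightweight inference is easy to see in code; a hedged PyTorch sketch (the model and file name are placeholders, nothing vendor-specific):

```python
# Sketch of the usual workflow: heavy training on a CUDA GPU, lightweight inference on CPU.
import torch
from torchvision.models import resnet18

device = "cuda" if torch.cuda.is_available() else "cpu"
model = resnet18(weights=None).to(device)              # training runs on the GPU if one exists
# ... training loop over the human-labelled images would go here ...

torch.save(model.state_dict(), "model.pt")             # ship only the trained weights

# Later, on a modest machine with no GPU:
cpu_model = resnet18(weights=None)
cpu_model.load_state_dict(torch.load("model.pt", map_location="cpu"))
cpu_model.eval()
with torch.no_grad():
    pred = cpu_model(torch.randn(1, 3, 224, 224))      # inference runs fine on CPU
```

The expensive CUDA hardware is mostly only needed while the labelled examples are being learned from.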
I really wouldn't call this AI. It is more or less an image identification system that relies on machine learning.
That was pretty much the definition of AI before LLMs came along.
And much before that it was rule-based expert systems, which were basically databases and fancy inference algorithms. So I guess "AI" has always meant "the most advanced computer science thing which looks kind of intelligent". It's only now that it looks intelligent enough to fool laypeople into thinking there actually is intelligence there.
Haha, I love Gell-Mann amnesia. A few weeks ago there was news about speeding up the internet to a gazillion bytes per nanosecond, and it turned out to be fake.
Now this thing is all over the internet and everyone believes it.
The source paper is available online, is published in a peer reviewed journal, and has over 600 citations. I'm inclined to believe it.
You sound like a person who hasn't been peer reviewed
Well one reason is that this is basically exactly the thing current AI is perfect for - detecting patterns.
Good news, but it's not "AI". Please stop calling it that.
it's ai, it all is. the code that controls where the creepers in Minecraft go? AI. the tiny little neural network that can detect numbers? also AI! is it AGI? no. it's still AI. it's not that modern tech is stealing the term ai, scifi movies are the ones that started misusing it and cash grab startups are riding the hypetrain.
Fair weather friends to AI crack me up.