Poisoning attacks on AI


  Battista Biggio (University of Cagliari) will present his research on Poisoning Attacks on AI as part of the Trustworthy AI series.

    WHAT IS THE TRUSTWORTHY AI SERIES?

    Artificial Intelligence (AI) systems have steadily grown in complexity, gaining predictive power often at the expense of interpretability, robustness and trustworthiness. Deep neural networks are a prime example of this development. While reaching “superhuman” performance in various complex tasks, these models are susceptible to errors when confronted with tiny (adversarial) variations of the input: variations which are either not noticeable or can be handled reliably by humans. This expert talk series will discuss these challenges of current AI technology and will present new research aiming at overcoming these limitations and developing AI systems which can be certified to be trustworthy and robust.

    The expert talk series will cover the following topics:

    • Measuring Neural Network Robustness
    • Auditing AI Systems
    • Adversarial Attacks and Defences
    • Explainability & Trustworthiness
    • Poisoning Attacks on AI
    • Certified Robustness
    • Model and Data Uncertainty
    • AI Safety and Fairness

    The Trustworthy AI series is moderated by Wojciech Samek, Head of AI Department at Fraunhofer HHI, one of the top 20 AI labs in the world.

    Battista Biggio: Poisoning attacks on AI

    Shownotes

    00:00 Opening remarks by ITU

    00:58 Introduction by Wojciech Samek

    01:37 Introduction by Battista Biggio – Poisoning Attacks on AI

    02:50 Artificial Intelligence Today 

    • Comparing AI to electricity: just as electricity did, AI is expected to transform industrial society once more.

    04:11 Is AI really smart?

    • Can we trust this technology? 
    • Are we happy with current results? 
    • We cannot fully trust AI yet.

    04:59 Adversarial Examples – (Gradient-based Evasion Attacks)

    • Input image -> adversarial perturbation (noise) -> misclassified, untrustworthy output (see the sketch below).
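
    A minimal sketch of a one-step gradient-based evasion attack (FGSM-style), assuming a PyTorch classifier; this is illustrative only and not necessarily the exact attack presented in the talk:

    ```python
    import torch
    import torch.nn.functional as F

    def fgsm_example(model, x, y, eps=0.03):
        """One-step gradient-sign perturbation of an input image.

        Illustrative sketch: eps, the loss and the model interface are
        assumptions, not taken from the talk.
        """
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)   # loss w.r.t. the true label
        loss.backward()
        # Move each pixel a small step in the direction that increases the loss.
        x_adv = (x_adv + eps * x_adv.grad.sign()).clamp(0.0, 1.0)
        return x_adv.detach()
    ```

    Even with a small eps, the perturbed image can be assigned a completely different label while looking unchanged to a human.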

    06:31 Not only in the digital domain

    • This applies not only in the digital domain, but also in the physical world.
    • Example: a stop sign can be misrecognised by a self-driving car.

    07:51 Other applicable domains

    • Audio: adversarial noise added to a digital audio signal can cause speech recognition to transcribe the wrong sentence.
    • Malware Examples: PDF, Android, Windows. 

    11:04 Timeline of Learning Security 

    • Useful for learning the history of the field and understanding where it is heading.

    12:21 Attacks against Machine Learning

    • Evasion attacks, i.e. adversarial examples crafted at test time.
    • Sponge attacks, which increase the latency or energy consumption of a model/system.
    • Model extraction, model inversion and membership inference, which leak information about the model or its users (privacy attacks).

    14:56 Poisoning attacks 

    • Denial-of-service poisoning attacks. Example: a person used 99 phones to trick Google Maps into reporting a traffic jam.
    • How does it work? Training data -> preprocessing -> classifier -> output; the attacker tampers with the training data.
    • Goal: maximize the classification error by injecting poisoning samples into the training set (TR).
    • Strategy: find an optimal attack point in TR that maximizes the classification error.

    21:00 Poisoning is a Bilevel Optimization problem 

    • Attacker’s objective: maximize the generalization error on untainted data with respect to the poisoning point x_c.
    • The outer problem searches over x_c to maximize this error, while the inner problem trains the classifier (minimizes the training loss) on the poisoned data; see the formulation below.
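
    A sketch of this bilevel formulation in the notation above (a single poisoning point x_c; D_tr is the training set, D_val the untainted validation data, w the classifier parameters, and L the loss):

    ```latex
    \max_{x_c} \; L\bigl(\mathcal{D}_{\mathrm{val}},\, w^{\star}(x_c)\bigr)
    \qquad \text{s.t.} \qquad
    w^{\star}(x_c) \in \operatorname*{arg\,min}_{w} \; L\bigl(\mathcal{D}_{\mathrm{tr}} \cup \{x_c\},\, w\bigr)
    ```

    The outer maximization crafts the poisoning point; the inner minimization is simply the normal training of the classifier on the poisoned data.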

    22:43 Bilevel Optimization 

    • To find the optimal x_c, compute the gradient of the attacker’s objective with respect to x_c and update x_c by gradient ascent.

    23:58 Gradient-based poisoning attacks 

    • The gradient is not easy to compute: the poisoning point affects the learned classification function only implicitly, through training.
    • To solve this, replace the inner learning problem with its equilibrium (KKT) conditions.
    • This enables computing the gradient in closed form (see the expression below).
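
    A hedged sketch of the resulting gradient: since the learned parameters w* depend on x_c only implicitly, the chain rule gives

    ```latex
    \nabla_{x_c} L\bigl(\mathcal{D}_{\mathrm{val}}, w^{\star}\bigr)
    \;=\;
    \left(\frac{\partial w^{\star}}{\partial x_c}\right)^{\!\top}
    \nabla_{w} L\bigl(\mathcal{D}_{\mathrm{val}}, w^{\star}\bigr),
    ```

    where the Jacobian of w* with respect to x_c is obtained by implicitly differentiating the stationarity (KKT) condition of the inner problem, i.e. the condition that the gradient of the training loss on D_tr ∪ {x_c} vanishes at w*; this yields a linear system that can be solved in closed form.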

    25:19 Experiment on MNIST digits. 

    26:01 Is bilevel optimization really needed?

    26:48 Towards poisoning deep neural networks 

    • Solving the poisoning problem without exploiting the KKT conditions, using back-gradient optimization.
    • This solves the bilevel problem more efficiently.
    • Scaling such attacks to deep neural networks is harder and has not been convincingly demonstrated yet.

    28:33 Poisoning attacks on algorithm fairness

    29:13 Why do adversarial attacks transfer? 

    • Transferability is the ability of an attack developed against a surrogate model to also succeed against a different target model.
    • It depends on the intrinsic vulnerability of the target model and on the alignment of the gradients of the two models (see the sketch below).
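
    A rough sketch of the gradient-alignment idea, assuming two PyTorch classifiers (surrogate and target) and a labelled input; the function name and interface are illustrative, not from the talk:

    ```python
    import torch
    import torch.nn.functional as F

    def gradient_alignment(surrogate, target, x, y):
        """Cosine similarity between the input gradients of two models.

        Higher alignment suggests that attacks crafted against the surrogate
        are more likely to transfer to the target.
        """
        def input_gradient(model):
            xi = x.clone().detach().requires_grad_(True)
            F.cross_entropy(model(xi), y).backward()
            return xi.grad.flatten()

        g_surrogate = input_gradient(surrogate)
        g_target = input_gradient(target)
        return F.cosine_similarity(g_surrogate, g_target, dim=0).item()
    ```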

    30:26 Countering Poisoning attacks 

    • Rationale: poisoning attacks typically work by injecting outlying training samples, so defences target such outliers.
    • Two strategies: data sanitization, i.e. removing poisoning samples from the training data; and robust learning, i.e. using learning algorithms that remain robust to a fraction of poisoned points.

    32:20 Robust regression with TRIM statistics 

    • TRIM learns the model by retaining only the training points with the smallest residuals.
    • Starting from an initial subset, the fit-and-trim step is iterated until the high-residual (poisoned) points are excluded; see the sketch below.
    • Experiments with TRIM on a loan dataset.
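
    A minimal sketch of the trimmed-fitting idea behind TRIM; the interface, the ridge base learner and the hyperparameters are assumptions made for illustration:

    ```python
    import numpy as np
    from sklearn.linear_model import Ridge

    def trim_regression(X, y, n_keep, n_iters=20, seed=0):
        """Iteratively fit a regressor on the n_keep points with smallest residuals.

        Poisoned points tend to produce large residuals, so they are gradually
        excluded from the retained subset.
        """
        rng = np.random.default_rng(seed)
        keep = rng.choice(len(y), size=n_keep, replace=False)  # initial random subset
        model = Ridge(alpha=1.0)
        for _ in range(n_iters):
            model.fit(X[keep], y[keep])
            residuals = (model.predict(X) - y) ** 2    # residuals on all points
            keep = np.argsort(residuals)[:n_keep]      # retain the smallest ones
        return model, keep
    ```

    Here n_keep would be set to the number of training points assumed to be clean.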

    34:10 Strength-detectability dilemma for poisoning attacks 

    • Examples

    34:35 Backdoor attacks 

    • Clean training data: the ideal case, shown for reference.
    • Backdoor (poisoning integrity) attacks place mislabeled training points in a region of the feature space far from the rest of the training data; the learning algorithm labels that region as the attacker desires, allowing for subsequent intrusions.

    37:18 Backdoor poisoning: three main categories

    • BadNets: the training data is poisoned with samples that contain the trigger.
    • Hidden-trigger backdoors: the poisoning samples do not visibly contain the trigger.
    • Poison Frogs, Convex Polytope, Bullseye Polytope: clean-label attacks that target a predefined class/sample.

    39:50 Defending against backdoor poisoning attacks 

    Process

    • Blind backdoor removal
    • Offline inspection
    • Online inspection
    • Post backdoor removal

    40:50 Ongoing work: backdoor smoothing

    • Why do backdoor attacks work, or fail to work?
    • Randomized smoothing is used to measure the instability/variability of the classification output around backdoored samples (see the sketch below).
    • Backdoor attacks are more successful when they induce a smoother classification output around the backdoored samples.
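
    A rough sketch of such a variability measurement, assuming a PyTorch classifier that accepts a batch of images; sigma, n_samples and the function name are illustrative assumptions:

    ```python
    import torch

    def prediction_variability(model, x, sigma=0.1, n_samples=100):
        """Estimate how unstable the predicted class is around a single input x.

        Gaussian noise is added to x, and the returned score is the fraction of
        noisy copies whose prediction differs from the majority class
        (0 = perfectly stable output around x, higher = more variable).
        """
        with torch.no_grad():
            noise = sigma * torch.randn((n_samples,) + tuple(x.shape))
            preds = model(x.unsqueeze(0) + noise).argmax(dim=1)
        majority_count = torch.bincount(preds).max().item()
        return 1.0 - majority_count / n_samples
    ```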

    42:23 Why is AI vulnerable?

    • Bernhard Schölkopf: the underlying assumption is that past data is representative of future data, i.e. that the data distribution is stationary.
    • The success of modern AI is mostly on tasks for which we can collect enough representative training data.
    • We cannot build AI models for every task an agent will ever encounter, and there is a whole world out there where the IID assumption is violated.

    44:41 What can we do, then? 

    • We lack testing, debugging and monitoring tools to better understand how these algorithms work.

    45:30 Conclusion

    46:04 Q&A Session 

    47:30 What can you say about scalability? Research on making these attacks and defences scale is still in progress.

    48:25 What do you mean by backdoor attacks becoming popular? They are among the most threatening attacks in the field, so they are under constant study; researchers investigate how the learning process can be compromised in order to design defence algorithms.

    50:50 What is the best way to use classifiers? Combining classifiers can make the system more robust, but it depends on how you combine them.

    52:33 Would you trust a model, or are there methods to tell whether a model has already been poisoned? I would trust a model only if it has been certified; I would use an uncertified model only if it is not critical for the company or for anyone else.

    54:16 How likely is it that an attacker can inject samples during training? A classifier can indeed be poisoned, but doing so is a very challenging process.

    55:43 Would stacking be somewhat better than a generic ensemble method? You need guarantees on the base classifiers to obtain an optimal combiner; that gives a more robust system.

    56:59 How can you obtain a certification? There are companies working on the security of machine learning algorithms, including poisoning attacks; but to obtain something robust, companies must focus on the quality of the algorithm itself.

    1:02:24 Closing Q&A Session

    1:02:34 Closing from ITU
