Business Analytics in Python (in English)
1.
Presentation for the course: «Business Analytics in Python (in English)»
on the topic: «Anomaly Detection in Transactions: Isolation
Forest (Scikit-learn) vs. Autoencoders.
Benchmarking on fintech datasets with PyOD and TensorFlow.»
Prepared by:
Student of group 15.27Д-БИ06/24м
Full-time program ВШКМиС
Dorozhkin Sergei Dmitrievich
Checked by:
Peleshenko Vitaly Alekseevich
Moscow – 2025
2.
Detecting anomalies in financial technologies
In the rapidly evolving world of financial technology, where digital transaction
volumes are growing exponentially, anomaly detection is crucial to protect
against fraud. In this presentation, we will compare two effective methods:
Isolation Forest
Autoencoders
We will examine their effectiveness and determine in which cases it is better to
use each of them to ensure the security of financial transactions.
3.
What is Fintech and why is anomaly detection important?
What is Fintech?
Fintech is a set of advanced digital technologies used in the development
of financial services and their provision to customers. This includes
online payments, loans, insurance, investments, and security.
The importance of anomaly detection
In Fintech, any deviation from the norm may indicate fraudulent activity
or a system error. Therefore, all transactions must be checked for
anomalies using various methods.
4.
Problems of anomaly detection in Fintech
High dimensionality of data
Each transaction is described by dozens of features, such as amount,
time, geolocation, and others.
Constantly evolving fraud patterns
Malicious actors adapt their strategies, and the behavior of regular
users also changes over time.
Class imbalance
Anomalies are rare, making it challenging to train models.
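The effect of class imbalance can be made concrete with a small arithmetic sketch. The numbers below are illustrative assumptions, not figures from the presentation: 100,000 transactions of which 0.2% are fraudulent, scored by a trivial "model" that labels every transaction as normal.

```python
# Illustrative numbers only (assumed): 100,000 transactions, 0.2% fraudulent.
n_total = 100_000
n_fraud = 200
n_normal = n_total - n_fraud

# A trivial "model" that labels every transaction as normal.
true_positives = 0          # fraud correctly flagged
false_negatives = n_fraud   # fraud missed
true_negatives = n_normal   # normal transactions correctly passed
false_positives = 0         # normal transactions wrongly flagged

accuracy = (true_positives + true_negatives) / n_total
recall = true_positives / (true_positives + false_negatives)

print(f"accuracy = {accuracy:.3f}")  # high, although no fraud is caught
print(f"recall   = {recall:.3f}")    # share of fraud actually detected
```

Accuracy looks excellent while recall is zero, which is why rare-event problems are usually judged by recall, precision, or ranking metrics rather than accuracy.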
5.
Isolation Forest
Principle of Operation
Isolation Forest builds an ensemble of random trees over the dataset.
Each tree repeatedly splits the data on a randomly chosen feature and
split value. Anomalies end up with shorter paths from the root than
normal points, because fewer splits are needed to isolate an anomalous
instance than a normal one.
Advantages
Isolation Forest does not require defining a distance metric or
additional information about the data structure.
It runs faster than most other anomaly detection algorithms.
It does not require significant memory.
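As a minimal sketch of the principle above, the following uses scikit-learn's IsolationForest on synthetic data. The two features (amount and hour of day), the three injected outliers, and the contamination value are assumptions made for illustration and do not come from the presentation.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Synthetic stand-in for transaction features (assumed): amount and hour of day.
normal = np.column_stack([
    rng.normal(50.0, 10.0, size=1000),  # typical amounts
    rng.normal(14.0, 3.0, size=1000),   # typical hours
])
# Hypothetical anomalies: very large amounts in the middle of the night.
outliers = np.array([[5000.0, 3.0], [7500.0, 4.0], [9000.0, 2.0]])
X = np.vstack([normal, outliers])

# contamination is our assumed share of anomalies; it sets the decision threshold.
model = IsolationForest(n_estimators=100, contamination=0.01, random_state=42)
model.fit(X)

# score_samples: lower values correspond to shorter average paths (more anomalous).
scores = model.score_samples(X)
labels = model.predict(X)  # -1 = anomaly, 1 = normal

print("mean score, normal points:", scores[:1000].mean())
print("mean score, outliers:     ", scores[1000:].mean())
print("labels of injected outliers:", labels[1000:])
```

No distance metric is specified anywhere in this sketch: the model only needs the feature matrix and an assumed anomaly share.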
6.
Autoencoders
Principle of Operation
The algorithm compresses the data and then reconstructs it, comparing
the output with the original input. Large reconstruction errors
indicate anomalies.
Architecture
It consists of an input layer, an encoder, a latent space, a decoder,
and an output layer.
Autoencoders are particularly well-suited for detecting unusual,
atypical data patterns in complex, nonlinear structures.
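The architecture and the reconstruction-error idea can be sketched with TensorFlow's Keras API. This is only an illustration under stated assumptions: the data is synthetic (eight correlated features generated from two hidden factors), and the layer sizes, number of epochs, and the 99th-percentile threshold are arbitrary choices, not values from the presentation.

```python
import numpy as np
import tensorflow as tf

tf.keras.utils.set_random_seed(0)
rng = np.random.default_rng(0)

# Synthetic "normal" data (assumed): 8 correlated features driven by 2 hidden factors.
n_features = 8
mixing = rng.normal(size=(2, n_features))
factors = rng.normal(size=(2000, 2))
noise = 0.1 * rng.normal(size=(2000, n_features))
X_normal = (factors @ mixing + noise).astype("float32")

# Hypothetical anomalies: points that ignore the correlation structure.
X_anom = rng.normal(0.0, 3.0, size=(20, n_features)).astype("float32")

autoencoder = tf.keras.Sequential([
    tf.keras.Input(shape=(n_features,)),           # input layer
    tf.keras.layers.Dense(16, activation="relu"),  # encoder
    tf.keras.layers.Dense(3, activation="relu"),   # latent space
    tf.keras.layers.Dense(16, activation="relu"),  # decoder
    tf.keras.layers.Dense(n_features),             # output layer
])
autoencoder.compile(optimizer="adam", loss="mse")

# Train the network to reproduce its own input, using normal data only.
autoencoder.fit(X_normal, X_normal, epochs=20, batch_size=64, verbose=0)

def reconstruction_error(model, X):
    """Mean squared error between each row and its reconstruction."""
    X_hat = model.predict(X, verbose=0)
    return np.mean((X - X_hat) ** 2, axis=1)

err_normal = reconstruction_error(autoencoder, X_normal)
err_anom = reconstruction_error(autoencoder, X_anom)

# Assumed rule: flag anything above the 99th percentile of the normal errors.
threshold = np.percentile(err_normal, 99)
print("mean error, normal:   ", err_normal.mean())
print("mean error, anomalies:", err_anom.mean())
print("anomalies flagged:", int((err_anom > threshold).sum()), "of", len(X_anom))
```

The threshold is the main tuning knob in this setup: a lower percentile catches more anomalies at the cost of more false alarms.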
7.
Comparison of Isolation Forest and Autoencoders
1
Performance
Isolation Forest trains quickly on small datasets, but its speed
decreases on large datasets. An autoencoder requires more computational
resources but scales efficiently on GPUs, making it preferable for
processing large datasets.
2
Accuracy
Isolation Forest detects point anomalies well, but performs worse
with group attacks. An autoencoder identifies complex nonlinear
dependencies.
3
Interpretability
Isolation Forest allows assessing the contribution of each feature
to an anomaly. An autoencoder functions as a "black box."
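A comparison along these lines needs a common measuring procedure. The sketch below is one possible harness, not the benchmark used in the presentation: the `benchmark` helper is hypothetical, the labeled data is synthetic, and only Isolation Forest is plugged in. Any other detector can be added by passing its own fit and scoring functions.

```python
import time

import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import average_precision_score, roc_auc_score

rng = np.random.default_rng(7)

# Synthetic labeled data (assumed): about 1% anomalies shifted away from the normal cluster.
X_normal = rng.normal(0.0, 1.0, size=(5000, 10))
X_anom = rng.normal(4.0, 1.0, size=(50, 10))
X = np.vstack([X_normal, X_anom])
y = np.concatenate([np.zeros(5000, dtype=int), np.ones(50, dtype=int)])  # 1 = anomaly

def benchmark(name, fit_fn, score_fn):
    """Time the fit and compute ranking metrics.

    score_fn must return one score per row, where higher means more anomalous.
    """
    start = time.perf_counter()
    fit_fn(X)
    fit_seconds = time.perf_counter() - start
    scores = score_fn(X)
    return {
        "model": name,
        "fit_seconds": fit_seconds,
        "roc_auc": roc_auc_score(y, scores),
        "avg_precision": average_precision_score(y, scores),
    }

iforest = IsolationForest(n_estimators=100, random_state=7)
result = benchmark(
    "IsolationForest",
    iforest.fit,
    # scikit-learn returns lower = more anomalous, so the sign is flipped.
    lambda data: -iforest.score_samples(data),
)
print(result)
```

Labels are used here only for evaluation; both methods are trained without them.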
8.
When to Use Isolation Forest?
Data
Moderate dimensionality
Time
Fast implementation required
Resources
No access to GPU
Analysis
Transactions are analyzed in real time
9.
When to Use Autoencoders?
1
Data
Available for training a neural network
2
Anomalies
Non-obvious patterns
3
Adaptability
Adaptation to changing conditions is required
10.
Tools for Implementation: PyOD and TensorFlow
PyOD
Unifies dozens of algorithms under a single interface, simplifying
experiments and result visualization.
TensorFlow
Provides flexibility in designing neural models and optimizes
computations on GPUs.
11.
Conclusion: Choosing the Right Method
1
Isolation Forest
High speed and interpretability for startups and real-time systems.
2
Autoencoders
Effective for large organizations with access to computational
resources and tasks involving non-obvious patterns.
The choice of method depends on specific requirements and constraints.
It is crucial to study anomaly detection methods and approaches to find
the most suitable solution for each case.
13.
Thank you for your attention!