Hi, I’m Mehdi Fekih
I specialize in the field of data science, where I analyze and interpret complex data to derive meaningful insights.
What I Can Do For You
Data Science
Unlocking insights and driving business growth through data analysis and visualization as a freelance data scientist.
Data Analysis
Helping businesses make informed decisions through insightful data analysis as a freelance data analyst.
Data Engineering
I'll help you ensure your business has a robust, scalable data infrastructure, enabling seamless data integration, storage, and analysis for actionable insights and growth.
Home Automation
Transforming your living space into a smart home with custom home automation solutions as a freelance home automation expert.
Docker & Server
Optimizing your software development and deployment with Docker and server management as a freelance expert.
Consultancy
Identification of scope, assessment of feasibility, cleaning and preparation of data, selection of tools and algorithms.
My Portfolio
The world of nutrition is vast and ever-evolving. With a plethora of food products available in the market, making informed choices becomes a challenge. Recognizing this, the agency “Santé publique France” initiated a call for innovative application ideas related to nutrition. We heeded the call and embarked on a journey to develop a unique application that not only provides insights into various food products but also suggests healthier alternatives.
The Inspiration Behind the App
The “Santé publique France” agency, dedicated to public health in France, recognized the need for innovative solutions to address the challenges of modern nutrition. With the vast dataset from Open Food Facts at our disposal, we saw an opportunity to create an application that could make a difference.
Our Streamlit App: Features & Highlights
Our application is designed to be user-friendly and informative:
- Product Insights: Users can select a product and view its comprehensive nutritional information.
- Healthier Alternatives: The app suggests healthier alternatives based on user-defined criteria and the product’s nutritional score.
- Visual Analytics: Throughout the app, users are presented with clear and concise visualizations, making the data easy to understand, even for those new to nutrition.
The Technical Journey
- Data Processing: The Open Food Facts dataset was our primary resource. We meticulously cleaned and processed this data, ensuring its reliability. This involved handling missing values, identifying outliers, and automating these processes for future scalability (a simplified sketch follows this list).
- Univariate & Multivariate Analysis: We conducted thorough analyses to understand individual variables and their interrelationships. This was crucial in developing the recommendation engine for healthier alternatives.
- Application Ideation: Drawing inspiration from real-life scenarios, like David’s sports nutrition focus, Aurélia’s quest for healthier chips, and Sylvain’s comprehensive product rating system, we envisioned an app that catered to diverse user needs.
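To make the cleaning step more concrete, here is a minimal pandas sketch of the kind of missing-value and outlier handling described above. It is illustrative only: the column names, thresholds, and imputation strategy are assumptions, not the project's actual code.

```python
import pandas as pd

# Column names are assumptions; the real Open Food Facts extract has many more.
NUTRITION_COLS = ["sugars_100g", "fat_100g", "salt_100g"]

def clean_off_dataset(df: pd.DataFrame) -> pd.DataFrame:
    """Sketch of missing-value and outlier handling on an Open Food Facts extract."""
    # Drop rows with no product name or nutrition grade.
    df = df.dropna(subset=["product_name", "nutriscore_grade"]).copy()

    # Impute remaining numeric gaps with the column median.
    df[NUTRITION_COLS] = df[NUTRITION_COLS].fillna(df[NUTRITION_COLS].median())

    # Remove impossible values: a quantity per 100 g must lie in [0, 100].
    for col in NUTRITION_COLS:
        df = df[df[col].between(0, 100)]
    return df
```

In the real pipeline these steps were wrapped into reusable functions so that new Open Food Facts exports could be processed without manual intervention.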
Deployment with Docker
To ensure our application’s consistent performance across different environments, we turned to Docker. Docker allowed us to containerize our app, ensuring its smooth and consistent operation irrespective of the deployment environment.
Conclusion & Future Prospects
Our Streamlit app is a testament to the power of data-driven solutions in addressing real-world challenges. By leveraging the Open Food Facts dataset and the simplicity of Streamlit, we’ve created an application that empowers users to make informed nutritional choices. As we look to the future, we’re excited about the potential enhancements and the broader impact our app can have on public health.
Automatic Speech Recognition: A Streamlit and Whisper Integration
Introduction:
In the ever-evolving landscape of technology, Automatic Speech Recognition (ASR) stands out as a pivotal advancement, turning spoken language into written text. In this article, we delve into a Python script that seamlessly integrates OpenAI’s Whisper ASR with Streamlit, a popular web app framework for Python, to transcribe audio files and present the results in a user-friendly interface.
Script Overview:
The script in focus utilizes several Python libraries, including os, pathlib, whisper, streamlit, and pydub, to create a web application capable of converting uploaded audio files into text transcripts. The application supports a variety of audio formats, such as WAV, MP3, MP4, OGG, WMA, AAC, FLAC, and FLV.
Key Components:
- Directory Setup: The script defines three main directories: UPLOAD_DIR for storing uploaded audio files, DOWNLOAD_DIR for saving converted MP3 files, and TRANSCRIPT_DIR for keeping the generated transcripts.
- Audio Conversion: The convert_to_mp3 function converts the uploaded audio file into MP3 format, regardless of its original format. This is achieved using a mapping of file extensions to the corresponding conversion methods provided by the pydub library.
- Transcription Process: The transcribe_audio function leverages OpenAI's Whisper ASR model to transcribe the converted audio file. Users can choose the model type (Tiny, Base, or Small) for transcription.
- Transcript Storage: The write_transcript function writes the generated transcript to a text file stored in TRANSCRIPT_DIR.
- User Interface: Streamlit is employed to create an intuitive user interface, allowing users to upload audio files, choose the ASR model, generate transcripts, and download the results. The interface also provides playback functionality for the uploaded audio file. A condensed sketch of the conversion and transcription functions follows this list.
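To make the components above more concrete, here is a condensed sketch of the conversion and transcription functions. It is a simplified illustration of the approach described, not the script verbatim; the exact file handling and model options may differ, and pydub additionally requires ffmpeg to be installed.

```python
from pathlib import Path

import whisper
from pydub import AudioSegment

def convert_to_mp3(src: Path, download_dir: Path) -> Path:
    """Convert any supported audio file to MP3 using pydub (needs ffmpeg)."""
    audio = AudioSegment.from_file(str(src), format=src.suffix.lstrip(".").lower())
    dst = download_dir / (src.stem + ".mp3")
    audio.export(str(dst), format="mp3")
    return dst

def transcribe_audio(mp3_path: Path, model_name: str = "base") -> str:
    """Transcribe an MP3 file with the chosen Whisper model (tiny, base, small)."""
    model = whisper.load_model(model_name)
    result = model.transcribe(str(mp3_path))
    return result["text"]
```

In the Streamlit app, the uploaded file is written to UPLOAD_DIR, passed through convert_to_mp3, and the returned text is then handed to write_transcript and displayed in the interface.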
Usage:
- Uploading Audio File: Users can upload their audio file through the Streamlit interface, where they are prompted to choose the file and the ASR model type.
- Generating Transcript: Upon clicking the “Generate Transcript” button, the script processes the audio file, transcribes it using the selected Whisper model, and displays a formatted transcript in a toggleable section.
- Downloading Transcript: Users have the option to download the generated transcript as a text file directly from the application.
Conclusion:
This innovative script exemplifies the integration of Automatic Speech Recognition technology with web applications, offering a practical solution for transcribing audio files. By combining the capabilities of OpenAI’s Whisper and Streamlit, it provides a versatile tool that caters to a wide range of audio formats and user preferences. Whether for academic research, content creation, or accessibility, this application stands as a testament to the boundless possibilities of ASR technology in enhancing digital communication.
Image Embedding with VGG16: A working example with grocery products
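The core idea behind this example is to use a pre-trained VGG16 network, stripped of its classification head, to turn each grocery product image into a fixed-length feature vector that can then be compared or clustered. A minimal sketch with Keras (the image file names are hypothetical; 224x224 is VGG16's standard input size):

```python
import numpy as np
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.preprocessing import image

# Pre-trained VGG16 without the top classifier; global average pooling
# yields one 512-dimensional embedding per image.
model = VGG16(weights="imagenet", include_top=False, pooling="avg")

def embed_image(path: str) -> np.ndarray:
    img = image.load_img(path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = preprocess_input(np.expand_dims(x, axis=0))
    return model.predict(x)[0]

# Cosine similarity between two product photos (hypothetical file names).
a = embed_image("apple_juice.jpg")
b = embed_image("orange_juice.jpg")
similarity = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {similarity:.3f}")
```

Embeddings like these can feed a nearest-neighbour search or clustering step to group visually similar products on the shelf.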
Credit Scoring Project for Credit Match
Background: In the face of the rapidly evolving financial market and the increasing demand for transparency from clients, Credit Match sought our expertise to develop an innovative credit scoring solution. The goal was twofold: to optimize the credit granting decision process and to enhance client relations through transparent communication.
Proposed Solution:
- Automated Scoring Model: We crafted an advanced classification algorithm that leverages a wide range of data sources, including behavioral data and information from third-party financial entities. The algorithm accurately predicts the likelihood of a client repaying their credit (a simplified sketch follows this list).
- Interactive Dashboard: Addressing the need for transparency, we designed an interactive dashboard tailored for client relationship managers. This dashboard not only elucidates credit granting decisions but also provides clients with easy access to their personal information.
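For the scoring model referenced above, a simplified sketch of the approach might look like the following. This is not Credit Match's actual model: the choice of LightGBM, the feature file, and the 10:1 cost ratio for missed defaults versus rejected good clients are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from lightgbm import LGBMClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Assumed layout: one row per application, binary "default" target,
# numeric and behavioural features already engineered.
df = pd.read_csv("applications.csv")  # hypothetical file
X, y = df.drop(columns=["default"]), df["default"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

model = LGBMClassifier(n_estimators=500, learning_rate=0.05)
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]
print("AUC:", roc_auc_score(y_test, proba))

# Cost-aware threshold: assume a missed default costs 10x a rejected good client.
costs = [(t, 10 * ((proba < t) & (y_test == 1)).sum()
             + ((proba >= t) & (y_test == 0)).sum())
         for t in np.linspace(0.05, 0.95, 19)]
best_threshold = min(costs, key=lambda c: c[1])[0]
print("cost-minimising threshold:", best_threshold)
```

Optimizing the decision threshold against a business cost function, rather than raw accuracy, is what keeps the cost of prediction errors low in practice.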
Technologies Deployed:
- Analysis and Modeling: We utilized Kaggle kernels to facilitate exploratory analysis, data preparation, and feature engineering. These kernels were adapted to meet the specific needs of Credit Match.
- Dashboard: Based on the provided specifications, we chose [Dash/Bokeh/Streamlit] to develop the interactive dashboard.
- MLOps: To ensure regular and efficient model updates, we implemented an MLOps approach, relying on Open Source tools.
- Data Drift Detection: The evidently library was integrated to anticipate and detect any future data discrepancies, thus ensuring the model’s long-term robustness.
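As a sketch of how the evidently library can flag drift between the training (reference) data and recent production data, assuming a version of evidently that exposes the Report / DataDriftPreset API and hypothetical data files:

```python
import pandas as pd
from evidently.metric_preset import DataDriftPreset
from evidently.report import Report

# reference: data the model was trained on; current: recent production data.
reference = pd.read_parquet("reference_applications.parquet")  # hypothetical
current = pd.read_parquet("current_applications.parquet")      # hypothetical

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=reference, current_data=current)
report.save_html("data_drift_report.html")  # shareable drift dashboard
```

Running a report like this on a schedule is one way to trigger model retraining within the MLOps pipeline before drift degrades the scoring quality.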
Deployment: The solution was deployed on [Azure webapp/PythonAnywhere/Heroku], ensuring optimal availability and performance.
Results: Our model demonstrated outstanding performance, with an AUC exceeding 0.82. However, we took precautions to avoid any overfitting. Moreover, considering Credit Match’s business specifics, we optimized the model to minimize costs associated with prediction errors.
Documentation: A detailed technical note was provided to Credit Match, allowing for transparent sharing of our approach, from model conception to Data Drift analysis.
Analysis of Data from Education Systems
Introduction
This project is an analysis of data from education systems, aimed at providing insights into the trends and patterns of the education sector. The data is sourced from various educational institutions, and the project aims to extract meaningful information from this data to inform education policies and practices.
Getting Started
To get started with this project, you will need to have a basic understanding of data analysis and data visualization tools. The project is written in Python, and you will need to have Python 3.x installed on your system.
You can download the source code from the GitHub repository and install the necessary dependencies using pip. Once you have installed the dependencies, you can run the project using the command line.
Project Structure
The project is organized into several directories, each of which contains code related to a specific aspect of the project. The directories are as follows:
- data: contains the data used in the project.
- notebooks: contains Jupyter notebooks used for data analysis and visualization.
- scripts: contains Python scripts used for data preprocessing and analysis.
- reports: contains reports generated from the data analysis.
Leveraging NLP for PDF Content Q&A with Streamlit and OpenAI
Introduction
In the vast ocean of unstructured data, PDFs stand out as one of the most common and widely accepted formats for sharing information. From research papers to company reports, these files are ubiquitous. But with the ever-growing volume of information, navigating and extracting relevant insights from these documents can be daunting. Enter our recent project: a Streamlit application leveraging OpenAI to answer questions about the content of uploaded PDFs. In this article, we’ll dive into the technicalities and the exciting outcomes of this endeavor.
The Challenge
While PDFs are great for preserving the layout and formatting of documents, extracting and processing their content programmatically can be challenging. Our goal was simple but ambitious: develop an application where users can upload a PDF and then ask questions related to its content, receiving relevant answers in return.
The Stack
- Streamlit: A fast, open-source tool that allows developers to create machine learning and data applications with ease.
- OpenAI: Leveraging the power of NLP models for text embeddings and semantic understanding.
- PyPDF2: A Python library to extract text from PDF files.
- langchain (custom modules): For text splitting, embeddings, and more.
The Process
- PDF Upload and Text Extraction: Once a user uploads a PDF, we use PyPDF2 to extract its text content, preserving the sequence of words.
- Text Splitting: Given that PDFs can be extensive, we implemented the CharacterTextSplitter from langchain to break the text down into manageable chunks. This modular approach ensures efficiency and high-quality results in the subsequent steps.
- Text Embedding: We employed OpenAIEmbeddings from langchain to convert these chunks of text into vector representations. These embeddings capture the semantic essence of the text, paving the way for accurate similarity searches.
- Building the Knowledge Base: Using FAISS from langchain, we constructed a knowledge base from the chunk embeddings, ensuring a swift and efficient retrieval process.
- User Q&A: With the knowledge base in place, users can pose questions about the uploaded PDF. By performing a similarity search within the knowledge base, we retrieve the chunks most relevant to the user's query.
- Answer Extraction: Leveraging OpenAI, we implemented a question-answering mechanism that provides users with precise answers based on the content of the PDF. A condensed sketch of the whole pipeline follows this list.
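Condensed to its essentials, the pipeline above looks roughly like this. The sketch assumes the classic langchain API and an OPENAI_API_KEY in the environment; the file name, chunk sizes, and chain type are illustrative choices rather than the application's exact settings.

```python
from PyPDF2 import PdfReader
from langchain.chains.question_answering import load_qa_chain
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS

# 1. Extract text from the uploaded PDF (hypothetical file name).
reader = PdfReader("uploaded.pdf")
text = "".join(page.extract_text() or "" for page in reader.pages)

# 2. Split the text into overlapping chunks.
splitter = CharacterTextSplitter(separator="\n", chunk_size=1000,
                                 chunk_overlap=200, length_function=len)
chunks = splitter.split_text(text)

# 3. Embed the chunks and build a FAISS knowledge base.
knowledge_base = FAISS.from_texts(chunks, OpenAIEmbeddings())

# 4-5. Retrieve relevant chunks for a question and extract the answer.
question = "What is the main conclusion of the document?"
docs = knowledge_base.similarity_search(question)
chain = load_qa_chain(OpenAI(), chain_type="stuff")
print(chain.run(input_documents=docs, question=question))
```

In the Streamlit app, the upload widget replaces the hard-coded file name and the question comes from a text input, but the flow is the same.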
Outcomes and Reflections
The Streamlit application stands as a testament to the power of combining user-friendly interfaces with potent NLP capabilities. While our project showcases significant success in answering questions about the content of a wide range of PDFs, there are always challenges:
- Quality of Text Extraction: Some PDFs, especially those with images, tables, or non-standard fonts, may not yield perfect text extraction results.
- Handling Large Documents: For exceedingly long PDFs, further optimizations may be required to maintain real-time processing.
Future Directions
- Incorporate OCR (Optical Character Recognition): To handle PDFs that contain images with embedded text.
- Expand to Other File Types: Venturing beyond PDFs to support other formats like DOCX or PPT.
- Advanced Models: Exploring more advanced models from OpenAI or even fine-tuning models for specific domain knowledge.
My Blog
Evaluating iSCSI vs NFS for Proxmox Storage: A Comparative Analysis
In the realm of virtualization, selecting the appropriate storage protocol is crucial for achieving optimal performance and efficiency. Proxmox VE, a prominent open-source platform for virtualization, offers a variety of storage options for Virtual Machines (VMs) and containers. Among these, iSCSI and NFS stand out as popular choices. This article aims to delve into the strengths and weaknesses of both iSCSI and NFS within the context of Proxmox storage, providing insights to help IT professionals make informed decisions.
Introduction to iSCSI and NFS
iSCSI (Internet Small Computer Systems Interface) is a storage networking standard that enables the transport of block-level data over IP networks. It allows clients (initiators) to send SCSI commands to SCSI storage devices (targets) on remote servers. iSCSI is known for its block-level access, providing the illusion of a local disk to the operating system.
NFS (Network File System), on the other hand, is a distributed file system protocol allowing a user on a client computer to access files over a network in a manner similar to how local storage is accessed. NFS operates at the file level, providing shared access to files and directories.
Performance Considerations
When evaluating iSCSI and NFS for Proxmox storage, performance is a critical factor.
iSCSI offers the advantage of block-level storage, which generally translates to faster write and read speeds compared to file-level storage. This can be particularly beneficial for applications requiring high I/O performance, such as databases. However, the performance of iSCSI can be heavily dependent on network stability and configuration. Proper tuning and a dedicated network for iSCSI traffic can mitigate potential bottlenecks.
NFS might not match the raw speed of iSCSI in block-level operations, but its simplicity and efficiency in handling file operations make it a strong contender. NFS v4.1 and later versions introduce performance enhancements and features like pNFS (parallel NFS) that can significantly boost throughput and reduce latency in environments with high-demand file operations.
Scalability and Flexibility
NFS shines in terms of scalability and flexibility. Being file-based, NFS allows for easier management, sharing, and scaling of files across multiple clients. Its stateless protocol simplifies recovery from network disruptions. For environments requiring seamless access to shared files or directories, NFS is often the preferred choice.
iSCSI, while scalable, may require more intricate management as the storage needs grow. Each iSCSI target appears as a separate disk, and managing multiple disks across various VMs can become challenging. However, iSCSI’s block-level access provides a high degree of flexibility in terms of partitioning, file system choices, and direct VM disk access, which can be crucial for certain enterprise applications.
Ease of Setup and Management
From an administrative perspective, the ease of setup and ongoing management is another vital consideration.
NFS is generally perceived as easier to set up and manage, especially in Linux-based environments like Proxmox. The configuration is straightforward, and mounting NFS shares on VMs can be accomplished with minimal effort. This ease of use makes NFS an attractive option for smaller deployments or scenarios where IT resources are limited.
iSCSI requires a more detailed setup process, including configuring initiators, targets, and often dealing with CHAP authentication for security. While this complexity can be a barrier to entry, it also allows for fine-grained control over storage access and security, making iSCSI a robust choice for larger, security-conscious deployments.
Security and Reliability
Security and reliability are paramount in any storage solution.
iSCSI supports robust security features like CHAP authentication and can be integrated with IPsec for encrypted data transfer, providing a secure storage solution. The block-level access of iSCSI, while efficient, means that corruption on the block level can have significant repercussions, necessitating strict backup and disaster recovery strategies.
NFS, being file-based, might have more inherent vulnerabilities, especially in open networks. However, NFSv4 introduced improved security features, including Kerberos authentication and integrity checking. File-level storage also means that corruption is usually limited to individual files, potentially reducing the impact of data integrity issues.
Conclusion
Choosing between iSCSI and NFS for Proxmox storage depends on various factors, including performance requirements, scalability needs, administrative expertise, and security considerations. iSCSI offers high performance and fine-grained control suitable for intensive applications and large deployments. NFS, with its ease of use, flexibility, and file-level operations, is ideal for environments requiring efficient file sharing and simpler management. Ultimately, the decision should be guided by the specific needs and constraints of the deployment environment, with a thorough evaluation of both protocols’ strengths and weaknesses.
https://www.youtube.com/watch?v=2HfckwJOy7A
How to Resolve Proxmox VE Cluster Issues by Temporarily Stopping Cluster Services
If you’re managing a Proxmox VE cluster, you might occasionally encounter situations where changes to cluster configurations become necessary, such as modifying the Corosync configuration or addressing synchronization issues. One effective method to make these changes safely involves temporarily stopping the cluster services. In this article, we’ll walk you through a solution that involves stopping the pve-cluster and corosync services and starting the Proxmox configuration filesystem daemon in local mode.
Understanding the Components
Before diving into the solution, it’s crucial to understand the role of each component:
- pve-cluster: This service manages the Proxmox VE cluster’s configurations and coordination, ensuring that all nodes in the cluster are synchronized.
- corosync: The Corosync Cluster Engine provides the messaging and membership services that form the backbone of the cluster, facilitating communication between nodes.
- pmxcfs (Proxmox Cluster File System): This is a database-driven file system designed for storing cluster configurations. It plays a critical role in managing the cluster’s shared configuration data.
Step-by-Step Solution
When you need to make changes to your cluster configurations, follow these steps to ensure a safe and controlled update environment:
- Prepare for Maintenance: Notify any users of the impending maintenance and ensure that you have a backup of all critical configuration files. It’s always better to be safe than sorry.
- Stop the pve-cluster Service: Begin by stopping the pve-cluster service to halt the synchronization process across the cluster. This can be done by executing the following command in your terminal:
systemctl stop pve-cluster
- Stop the corosync Service: Next, stop the Corosync service to prevent any cluster membership updates while you’re making your changes. Use this command:
systemctl stop corosync
- Start pmxcfs in Local Mode: With the cluster services stopped, you can now safely start pmxcfs in local mode using the -l flag:
pmxcfs -l
This allows you to work on the configuration files without immediate propagation to other nodes.
- Make Your Changes: With pmxcfs running in local mode, proceed to make the necessary changes to your cluster configuration files. Remember, any modifications made during this time should be carefully considered and double-checked for accuracy.
- Restart the Services: Once your changes are complete and verified, restart the corosync and pve-cluster services to re-enable the cluster functionality.
systemctl start corosync
systemctl start pve-cluster
- Verify Your Work: After the services are back up, it’s essential to verify that your changes have been successfully applied and that the cluster is functioning as expected. Use Proxmox VE’s built-in diagnostic tools and commands to check the cluster’s status.
Conclusion
Modifying cluster configurations in a Proxmox VE environment can be a delicate process, requiring careful planning and execution. By temporarily stopping the pve-cluster and corosync services and leveraging the local mode of pmxcfs, you gain a controlled environment in which to make and apply your changes safely. Always ensure that you have backups of your configuration files before proceeding, and thoroughly test your changes to avoid unintended disruptions to your cluster’s operation.
Remember, while this method can be effective for various configuration changes, it’s crucial to consider the specific needs and architecture of your cluster. When in doubt, consult with Proxmox VE documentation or seek assistance from the community or professional support.
AWS – Key Differences Between Network Access Control Lists (NACLs) and Security Groups
In the realm of cloud computing, safeguarding your resources against unauthorized access is paramount. Two pivotal components that play a crucial role in this security paradigm are Network Access Control Lists (NACLs) and Security Groups. Although both serve the purpose of regulating access to network resources, they operate at different levels and have distinct functionalities. This article delves into the core differences between NACLs and Security Groups to help you better understand their roles and applications in cloud security.
What are NACLs?
Network Access Control Lists (NACLs) act as a firewall for controlling traffic at the subnet level within a Virtual Private Cloud (VPC). They provide a layer of security that controls both inbound and outbound traffic at the network layer. NACLs work by evaluating traffic based on rules that either allow or deny traffic entering or exiting a subnet. These rules are evaluated in order, and the first rule that matches the traffic determines whether it’s allowed or denied.
What are Security Groups?
Security Groups, on the other hand, function as virtual firewalls for individual instances or resources. They control inbound and outbound traffic at the instance level, ensuring that only the specified traffic can reach the resource. Unlike NACLs, Security Groups evaluate all rules before deciding, and if any rule allows the traffic, it is permitted.
Key Differences
- Level of Application:
  - NACLs: Operate at the subnet level, affecting all resources within that subnet.
  - Security Groups: Applied directly to instances, providing granular control over individual resources.
- Statefulness:
  - NACLs: Stateless, meaning responses to allowed inbound traffic are subject to outbound rules, and vice versa.
  - Security Groups: Stateful, allowing responses to allowed inbound traffic without requiring an outbound rule.
- Rule Evaluation:
  - NACLs: Process rules in numbered order, with the first match determining the action.
  - Security Groups: Evaluate all rules before deciding, allowing traffic if any rule permits it.
- Default Behavior:
  - NACLs: A newly created custom NACL denies all inbound and outbound traffic until allow rules are added (the VPC's default NACL allows all traffic).
  - Security Groups: Typically allow all outbound traffic and deny all inbound traffic by default, until specific allow rules are added.
- Use Cases:
  - NACLs: Ideal for broad, subnet-level rules, like blocking a specific IP range from accessing any resources in a subnet.
  - Security Groups: Best suited for more granular, resource-specific rules, such as allowing web traffic to a web server but not to other types of instances. A short sketch showing how each is configured follows this list.
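To make the contrast concrete, here is a small boto3 sketch showing how an equivalent "allow HTTP" rule is expressed for each. The resource IDs and region are placeholders; note that the stateless NACL also needs a separate outbound rule for return traffic on ephemeral ports, whereas the stateful security group does not.

```python
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")  # region is illustrative

# Security group: one stateful inbound rule; return traffic is allowed implicitly.
ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",  # placeholder ID
    IpPermissions=[{
        "IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
        "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
    }],
)

# NACL: stateless and evaluated by rule number; inbound rule 100 allows HTTP in...
ec2.create_network_acl_entry(
    NetworkAclId="acl-0123456789abcdef0",  # placeholder ID
    RuleNumber=100, Protocol="6", RuleAction="allow", Egress=False,
    CidrBlock="0.0.0.0/0", PortRange={"From": 80, "To": 80},
)
# ...and a separate outbound rule is needed for the ephemeral return ports.
ec2.create_network_acl_entry(
    NetworkAclId="acl-0123456789abcdef0",
    RuleNumber=100, Protocol="6", RuleAction="allow", Egress=True,
    CidrBlock="0.0.0.0/0", PortRange={"From": 1024, "To": 65535},
)
```

The asymmetry in the NACL configuration is exactly the statefulness difference described above, and it is the most common source of "it works with security groups but not with NACLs" surprises.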
Conclusion
Understanding the differences between NACLs and Security Groups is crucial for effectively managing network security in a cloud environment. While NACLs offer a first line of defense at the subnet level, Security Groups provide more granular control at the instance level. By leveraging both in your security strategy, you can ensure a robust defense-in-depth approach to securing your cloud resources.
Remember, the optimal use of NACLs and Security Groups depends on your specific security requirements and network architecture. It’s essential to carefully plan and implement these components to achieve the desired security posture for your cloud environment.
Unlocking Data Transformation with DBT: A Comprehensive Guide
Introduction:
In an era where data is akin to digital gold, the ability to refine this raw resource into actionable insights is crucial for any business aiming for success. DBT (Data Build Tool) emerges as a beacon of efficiency in the vast sea of data transformation tools, providing a structured, collaborative, and version-controlled environment for data analysts and engineers.
What is DBT?
DBT stands for Data Build Tool, an open-source software application that redefines the way data teams approach data transformation. It acts as a bridge between data engineering and data analysis, allowing teams to transform, test, and document data workflows efficiently. DBT treats data transformation as a craft, turning SQL queries into testable, deployable, and documentable artifacts.
Diving Deeper into DBT’s Key Features
- Version Control: DBT’s seamless integration with version control systems like Git ensures that every change to the data transformation scripts is tracked, allowing for collaborative development and historical versioning.
- Testing: With DBT, data reliability and accuracy take the forefront. DBT provides a framework for writing and executing tests against your data models, ensuring that the data meets the specified business rules and quality standards.
- Documentation: DBT automatically generates documentation from your data models, making it easier for teams to understand the data’s flow, dependencies, and transformations. This self-documenting aspect of DBT promotes better knowledge sharing and data governance within organizations.
The Transformative Impact of DBT
DBT empowers data teams by bridging the gap between data engineering and analytics, providing a platform that enhances data transformation workflows with efficiency, transparency, and quality. Its ability to treat transformations as code brings software engineering best practices into the data analytics realm, fostering a more collaborative and disciplined approach to data processing.
Implementing DBT in Your Data Stack
Implementing DBT starts with its installation and connecting it to your data warehouse. The next step involves defining your data models, which serve as the foundation for your transformations. DBT supports a wide range of data warehouses and databases, including Snowflake, BigQuery, Redshift, and more, ensuring compatibility and flexibility in various data ecosystems.
Expanding Horizons: DBT’s Applications in Different Industries
DBT’s versatility shines across various industries, from e-commerce to healthcare, finance, and beyond. It streamlines data operations, enabling businesses to harness their data effectively for reporting, analytics, and decision-making. DBT’s ability to manage complex data transformations at scale makes it a valuable asset for any data-driven organization looking to optimize its data workflows.
Conclusion: The Future of Data Transformation with DBT
In the rapidly evolving data landscape, DBT stands out as a tool that not only simplifies data transformation but also elevates it to a level where accuracy, efficiency, and collaboration are paramount. By adopting DBT, organizations can look forward to a future where data is not just a resource but a well-oiled engine driving informed decisions and strategic insights.
Join the DBT Revolution
Embracing DBT means stepping into a world where data transformation is no longer a bottleneck but a catalyst for growth and innovation. Whether you’re a data analyst, engineer, or business leader, DBT offers the tools and community support to transform your data practices. Dive into the DBT community, explore its rich resources, and start your journey towards data excellence.
Setting Up Nextcloud with Cloudflare Tunnel: A Guide
Looking to access your private Nextcloud setup outside of your local network? Let’s discover how to securely do this using the Cloudflare Tunnel. If you’ve got any concerns, remember, our Cloudflare Specialists are always available to assist.
Understanding the Nextcloud Cloudflare Tunnel
Personal Nextcloud setups on home servers have risen in popularity in our digital age. While it’s great within a local network, accessing it externally poses a challenge.
The solution? Cloudflare Tunnel.
The Cloudflare Tunnel offers a free yet robust method to link your server to Cloudflare without revealing your IP address. This ensures you can remotely access Nextcloud without compromising on security.
Step-by-Step Process
- Initiating the Connection: Begin by downloading and installing the cloudflared daemon. This establishes an encrypted channel between Cloudflare and your server.
wget -q https://bin.equinox.io/c/VdrWdbjqyF/cloudflared-stable-linux-amd64.deb
sudo dpkg -i cloudflared-stable-linux-amd64.deb
- Server Authentication: Authenticate your server with Cloudflare.
cloudflared tunnel login
- Tunnel Creation: Create a uniquely named tunnel.
cloudflared tunnel create [Your Tunnel Name]
- Configuration: Draft a configuration file. Ensure it incorporates the precise tunnel ID and credentials.
tunnel: [Your Tunnel ID]
credentials-file: /path/to/your/credentials.json
ingress:
- hostname: nextcloud.yourwebsite.com
service: http://your.local.ip.address
- service: http_status:404
- DNS Configuration: Visit Cloudflare’s dashboard to adjust DNS settings. Add a CNAME record that points to your tunnel ID. If you run multiple services through one tunnel, create a distinct CNAME entry for each.
Managing Your Tunnel
- Service Installation: The tunnel should always be active. Install it as a service.
sudo cloudflared service install
- Service Management: Enable the cloudflared service so it runs in the background and starts automatically at boot.
sudo systemctl start cloudflared
sudo systemctl enable cloudflared
- Configuration Updates: If you tweak the configuration, restart the service.
sudo systemctl restart cloudflared
Troubleshooting: Occasionally, after setup, you might encounter the “Access through untrusted domain” issue. Resolve it by:
- Navigating to your Nextcloud’s config.php file.
- Incorporating your new domain into the trusted_domains list.
'trusted_domains' =>
array (
0 => 'localhost',
1 => 'your.local.ip.address',
2 => 'nextcloud.yourwebsite.com',
),
Comparing the Performance and Cost of A100, V100, T4 GPUs, and TPU in Google Colab
Graphics Processing Units (GPUs) have revolutionized the world of computing, especially in areas that require high computational power such as deep learning, data analytics, and graphics rendering. With the rise of cloud platforms like Google Colab, users now have access to powerful GPUs and TPUs (Tensor Processing Units) for their computational tasks. In this article, we will delve into a comparative analysis of the A100, V100, T4 GPUs, and TPU available in Google Colab.
NVIDIA A100 GPU: The NVIDIA A100, based on the latest Ampere architecture, is a powerhouse in the world of GPUs. Designed primarily for data centers, it offers unparalleled computational speed, reportedly up to 20 times faster than its predecessors. Available in both 40 GB and 80 GB models, the 80 GB variant boasts the world’s fastest bandwidth at 2 TB/s. This GPU is ideal for high-performance computing tasks, especially in the realm of data science.
NVIDIA V100 GPU: The V100 is another beast from NVIDIA, tailored for data science and AI applications. With its memory optimization capabilities, NVIDIA claims a single 32 GB V100 can deliver the deep learning performance of up to 100 CPU-only servers. However, it’s worth noting that the V100 may not be the best choice for gaming applications.
NVIDIA T4 GPU: The T4 is NVIDIA’s answer to the needs of deep learning, machine learning, and data analytics. It’s designed to be energy-efficient while still delivering high-speed computational power. While the A30 is said to be ten times faster, the T4 remains a reliable choice for specific workloads.
Google TPU: Google’s Tensor Processing Unit (TPU) is a custom-developed chip designed to accelerate machine learning tasks. Available in Google Colab, the TPU offers high-speed matrix computations, essential for deep learning models. While it’s not a GPU, its specialized architecture makes it a formidable competitor, especially for tensor-based computations.
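Before benchmarking, it helps to confirm which accelerator your Colab runtime actually received. A small, hedged check (PyTorch for GPUs; the TPU test relies on the COLAB_TPU_ADDR environment variable that classic Colab TPU runtimes set, so treat it as an assumption on newer TPU VM runtimes):

```python
import os

import torch

if torch.cuda.is_available():
    # Reports e.g. "Tesla T4", "Tesla V100-SXM2-16GB" or "NVIDIA A100-SXM4-40GB".
    print("GPU:", torch.cuda.get_device_name(0))
    props = torch.cuda.get_device_properties(0)
    print(f"Memory: {props.total_memory / 1e9:.1f} GB")
elif "COLAB_TPU_ADDR" in os.environ:
    print("TPU runtime detected at", os.environ["COLAB_TPU_ADDR"])
else:
    print("CPU-only runtime")
```

Knowing the exact device (and its memory) up front avoids drawing conclusions from runs that silently fell back to a smaller GPU or to the CPU.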
Which One to Choose?:
- For Data Science and AI: Both the NVIDIA A100 and V100 are top contenders. The A100, with its latest architecture and unmatched speed, might edge out for most tasks. However, the V100 remains a solid choice.
- For Deep Learning and Machine Learning: The T4 is a reliable choice, but if tensor computations dominate your workload, the TPU might be more efficient.
- For Cost-Effectiveness: Google Colab offers free access to both GPUs and TPUs, but for prolonged and intensive tasks, it’s essential to consider the runtime limits and potential costs of using these resources on cloud platforms.
In conclusion, the choice between A100, V100, T4, and TPU depends on the specific requirements of the task at hand. Google Colab provides a fantastic platform to experiment and determine which suits best for your needs. As computational needs grow, it’s reassuring to know that such powerful tools are within easy reach.
About Us: At Prismanalytics, we understand the importance of high-performance computing. We offer solutions tailored to your needs, ensuring that you have access to the best resources for your tasks. Whether you’re a startup or an established enterprise, our platform is designed to provide efficiency, reliability, and speed at competitive prices. Join our growing list of satisfied customers and experience the difference.
My Resume
Education
MSc in Data Science and Artificial Intelligence
2022 - 2023: Training in data science & artificial intelligence methods, emphasizing mathematical and computer science perspectives.
Master In Management
EDHEC Business School (2005 - 2009): English Track Program, Major in Entrepreneurship.
BSc in Applied Mathematics and Social Sciences
University Paris 7 Denis Diderot (2006): General university studies with a focus on applied mathematics and social sciences.
Higher School Preparatory Classes
Lycée Jacques Decour, Paris (2002 - 2004): Classe préparatoire aux Grandes Écoles de Commerce, scientific track.
Scientific Baccalaureate
1999 - 2002: Mathematics major.
Data Science
Python
SQL
Machine learning libraries
Data visualization tools
Big Data (Spark, Hive)
Data Analysis
Spreadsheet software
Data visualization tools (Tableau, PowerBI, and Matplotlib)
Statistical software (SAS, SPSS)
SAP Business Objects
Database management systems (SQL, MySQL)
Development
HTML
CSS
JAVASCRIPT
SOFTWARE
Version Control Systems
MLOps
CI/CD
Docker and Kubernetes
AutoML
Model serving frameworks
Prometheus, Grafana
Job Experience
Consulting, Automation and Security
(2019 - Present): Implementation of automated reporting tools via SAP Business Objects; processing and securing of sensitive data (data wrangling, encryption, redundancy); remote monitoring and management solutions via connected objects (IoT) and image processing; internal pentesting and network security consulting; VPN implementation; server outsourcing.
Consulting, E-commerce and Digital Marketing
(2017 - 2019): Consulting in e-commerce and digital marketing (Bangkok area); management of Booking.com, Airbnb, and Agoda online bookings for third parties; SEO for the hotel industry.
Consulting, Internal Company Networks
(2016 - 2017): Implementation of corporate networks and virtualization solutions (rack cabling, firewalls, Proxmox virtualization); management of firewalls and internal networks (pfSense).
Entrepreneurship Experience
Founder, web developer
(2015 - 2020): Programming and maintenance of websites and web applications; consulting in digitalization and process optimization for local SMEs; implementation of turnkey e-commerce solutions.
Corporate Banking
Assistant Fund Manager
Credit Portfolio Management - CALYON - 2008: ● Preparation of committee notes for new ABS/CDO credit derivative investments ● Calculation and measurement of portfolio risk (Value-at-Risk, exotic and vanilla ABS, swaps, liquidity lines) ● Daily monitoring of credit derivatives portfolio structures (Mark-to-Market, P&L, refinancing) ● Design of risk measurement and decision-support tools in VBA.
Credit Risk Analyst
Risk and Controls Department – NATIXIS – Paris: ● Study of financing files for review by the credit committee (structured finance, commodity trade finance, and car manufacturers) ● Financial analysis, rating, and credit risk analysis of a portfolio of companies ● Financing files studied: from €1m to €1,000m.
Retail Banking
Assistant Business Account Manager
BNP Paribas - 2006: ● Writing reports on business plans for small SMEs ● Risk and feasibility analysis and decision-making ● Negotiation of financing packages with applicants.
Contact Me
Mehdi Fekih
Data Scientist. I am available for freelance work. Connect with me via this contact form or feel free to send me an email.
Phone: +33 (0) 7 82 90 60 71 Email: mehdi.fekih@edhec.com