Notes on issues setting up DeepSeek-Prover-V1.5 (Part 1)

If you’ve known me for some time, you might be aware that my background is in mathematics and statistics, from long before I ever got into data science, machine learning, and AI. I always enjoyed analysis and writing proofs, as well as the foundations and philosophy of mathematics. Relatively early in my undergraduate studies I made it a point to read through my university library’s copy of Principia Mathematica. I’d never claim I fully understood it all, but I appreciated its goals, and the rigor of the work speaks for itself. That alone made it something worth pushing through and trying to comprehend. It illustrated that, in principle, one could write proofs this formally for deeper mathematical results, though they would be complicated and time consuming (at least by hand). Gödel’s Incompleteness Theorems, however, showed that there are fundamental limits to any complete and consistent axiomatic system for mathematics. In fact, the title of Gödel’s paper is On Formally Undecidable Propositions of Principia Mathematica and Related Systems I; it is a direct response to the program of Principia Mathematica.

Around the same time in my studies I was introduced to a proof assistant called Prototype Verification System (PVS). It was tedious to use, but oddly satisfying when you got it right, somewhat like finally tracking down a bug in an install or in your own code. Proof assistants are a practical, interactive way to write proofs using symbolic logic.

It hasn’t been a space I’ve thought much about for some time, but while looking at various benchmarks for generative AI models I noticed there are benchmarks and models focused specifically on proof generation and reasoning. I was aware of models focused on general mathematical reasoning, of course, but it hadn’t clicked that there were active efforts to build proof copilots. I find the idea compelling and decided I wanted to get caught up and dabble with it myself. I noticed there’s a version of DeepSeek configured specifically for proofs in Lean 4. In my research I had also stumbled on a system called Metamath. So I have some interest in picking up both, at least as far as getting DeepSeek working with Lean 4 and interfacing between it and Metamath. I’m still a ways away from that at this point, but that’s the aspiration.
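For a sense of what these proof copilots target, here is a toy Lean 4 theorem (my own illustrative example, not taken from the DeepSeek repo). For harder goals, the prover model’s job is to produce the term or tactic script on the right-hand side of the `:=`:

```lean
-- A trivial Lean 4 theorem; `Nat.add_comm` from the core library closes it.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```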

So, I have been trying to get DeepSeek-Prover-V1.5 running. As their documentation calls out, first you set up Lean 4, but if you read the Lean documentation you’ll find it has a version manager of its own called elan that you have to install first in order to configure your environment for Lean 4. This part wasn’t overly complicated, but it’s probably worth a callout on the DeepSeek page that there are essentially two repos and two sets of instructions you need to follow to get Lean 4 going.

Afterwards, I ran into issues building the required Python environment for the model. The system I am using runs Ubuntu 22.04 (I can provide more hardware specs later), but in order to host the model, even with 4-bit quantization, you need a fair amount of memory since it is a 7B parameter model. Fortunately the system I am working on has 2x Nvidia RTX 2080 Ti GPUs connected by NVLink.
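The memory pressure is easy to estimate from the parameter count alone. A back-of-envelope sketch (weights only; activations, KV cache, and framework overhead come on top of this):

```python
# Rough weight-memory estimate for a model by parameter count and precision.
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    return n_params * bits_per_param / 8 / 1e9

fp16 = weight_memory_gb(7e9, 16)  # full half-precision weights
q4 = weight_memory_gb(7e9, 4)     # 4-bit quantized weights

print(f"fp16: {fp16:.1f} GB, 4-bit: {q4:.1f} GB")  # fp16: 14.0 GB, 4-bit: 3.5 GB
```

So even quantized, the weights alone want a few GB, and at fp16 a 7B model already strains a single 11 GB 2080 Ti.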

So, I created a conda environment for this with Python 3.10 per the documentation using

conda create -n deepseek python=3.10

I then ran

pip install -r requirements.txt

as one does, but of course all sorts of problems happened. That’s my purpose here: to walk through what I ran across, given the gaps in their requirements.txt and overall environment setup instructions.

Even though requirements.txt contains torch==2.2.1, this won’t automatically grab the build you are hoping for. In my experience, pip tends to prefer CPU-only builds of PyTorch. So we need to keep the various libraries and their versions in mind, but break the install into pieces for this to work.
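The build variant is encoded in the version string’s “local version” suffix, e.g. `2.2.1+cu121` versus `2.2.1+cpu`, which is how you can tell at a glance what pip actually gave you. A small sketch of reading that suffix (the helper is my own, and it assumes a two-digit CUDA major version):

```python
# PyTorch encodes its build variant in the local version suffix after "+".
def torch_build(version: str) -> str:
    base, _, local = version.partition("+")
    if local.startswith("cu"):
        # e.g. "cu121" -> CUDA 12.1 (assumes a two-digit major version)
        return f"CUDA {local[2:4]}.{local[4:]} build of {base}"
    if local == "cpu":
        return f"CPU-only build of {base}"
    return f"default build of {base}"

print(torch_build("2.2.1+cu121"))  # CUDA 12.1 build of 2.2.1
print(torch_build("2.2.1+cpu"))    # CPU-only build of 2.2.1
```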

If you cat requirements.txt you will see the following packages and versions:
pytz==2022.1
easydict==1.13
torch==2.2.1
transformers==4.40.1
vllm==0.4.1
numpy==1.26.4
pandas==1.4.3
tabulate==0.9.0
termcolor==2.4.0
accelerate==0.33.0
flash_attn==2.6.3

In order to get around the first hurdle we need to do the PyTorch install on its own. You can look at the previous versions page on the PyTorch project to get the basic idea on how to install it.

The command I ran was: pip install torch==2.2.1 --index-url https://download.pytorch.org/whl/cu121

You’d think you could just delete torch from requirements.txt and run it now, but you’ll next hit an issue with flash_attn, so keep that version in mind for the end. You can either modify requirements.txt or run pip with everything else directly; I did the latter.
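If you’d rather not type the remaining packages out by hand, the filtering can be scripted. A minimal sketch (the helper and its logic are my own, not from the DeepSeek repo):

```python
# Drop the packages we install separately (torch, flash_attn) and
# keep everything else from requirements.txt for a single pip invocation.
def remaining_requirements(lines, skip=("torch", "flash_attn")):
    keep = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        name = line.split("==")[0].strip()
        if name not in skip:
            keep.append(line)
    return keep

# Abbreviated stand-in for the real requirements.txt contents.
reqs = """\
pytz==2022.1
torch==2.2.1
transformers==4.40.1
flash_attn==2.6.3
""".splitlines()

print(" ".join(remaining_requirements(reqs)))  # pytz==2022.1 transformers==4.40.1
```

The joined output is exactly what you would paste after `pip install`.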

pip install pytz==2022.1 easydict==1.13 transformers==4.40.1 vllm==0.4.1 numpy==1.26.4 pandas==1.4.3 tabulate==0.9.0 termcolor==2.4.0 accelerate==0.33.0

At this point you can quickly test PyTorch by running:
python -c "import torch; print(torch.__version__)"

If you have followed along, this should print: 2.2.1+cu121

Now, others may or may not have the flash_attn issue I had. I am a bit stubborn, and this particular host has been upgraded across multiple versions of Ubuntu rather than given a clean install. So I needed to fix an Nvidia issue before finally installing flash_attn and moving forward.

My system is running Nvidia driver 535.183.01, but because of how this system had been maintained, nvcc --version reported 11.5.119. That doesn’t meet flash_attn’s minimum requirement of CUDA 11.6, and recall from above that we installed the cu121 build of PyTorch, so unless you have a reason to be on a different version it makes sense to bring the CUDA toolkit up to 12.1 to match.
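One small trap when eyeballing these requirements: dotted version strings shouldn’t be compared as plain strings, since lexical order breaks once a component hits two digits. A quick sketch of doing the comparison numerically (the helper is my own):

```python
# Compare dotted version strings numerically, not lexically:
# as strings, "11.10" < "11.6" would hold, which is wrong.
def version_tuple(v: str) -> tuple:
    return tuple(int(x) for x in v.split("."))

MIN_CUDA = version_tuple("11.6")
print(version_tuple("11.5.119") >= MIN_CUDA)  # False: too old for flash_attn
print(version_tuple("12.1.105") >= MIN_CUDA)  # True
```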

After reviewing this step-by-step guide as well as this forum post and this one, I was able to get CUDA 12.1 installed (do note the guide is for 12.2, so you have to modify the steps slightly if you want 12.1 like I did).

What I did specifically was the following

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-1

You also need to edit your .bashrc file and add the following:

export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

(The guide I followed used /usr/local/cuda-12.2 here; since I installed 12.1, the path is cuda-12.1.)

You can then reload it by running

. ~/.bashrc

Then running nvcc --version returned V12.1.105

Now we can complete the requirements.txt by running:
pip install flash_attn==2.6.3

Will wrap this up for now. Next I’ll get DeepSeek-Prover itself running on the host. Stay tuned!
