This is a slightly modified version of the script from
https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/docker
apt-get update
apt-get install -y --no-install-recommends \
ca-certificates \
curl \
jq \
git \
iputils-ping \
libcurl4 \
libunwind8 \
netcat \
libssl1.0
curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
mkdir /azp
export TARGETARCH=linux-x64
export AZP_URL="https://dev.azure.com/Lightning-AI"
export AZP_TOKEN="xxxxxxxxxxxxxxxxxxxxxxxxxx"
export AZP_POOL="lit-rtx-3090"
for i in {0..7..2}
do
nohup bash .azure/start.sh \
"AZP_AGENT_NAME=litGPU-YX_$i,$((i+1))" \
"CUDA_VISIBLE_DEVICES=$i,$((i+1))" \
> "agent-$i.log" &
done
ps aux | grep start.sh
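The loop above starts one agent per pair of GPUs (0,1 then 2,3 and so on). As a minimal sketch, assuming an 8-GPU machine, the pairing can be previewed without launching anything. The helper name list_agent_pairs is hypothetical and not part of this repository:

```shell
# Hypothetical dry-run helper: prints the agent name and GPU pair that each
# loop iteration would use, instead of starting an agent.
list_agent_pairs() {
  num_gpus="${1:-8}"   # assumed number of GPUs on the machine
  i=0
  while [ "$i" -lt "$num_gpus" ]; do
    echo "AZP_AGENT_NAME=litGPU-YX_$i,$((i + 1)) CUDA_VISIBLE_DEVICES=$i,$((i + 1))"
    i=$((i + 2))
  done
}

list_agent_pairs 8
```

With 8 GPUs this prints four lines, one per agent, which should match the agent-0.log, agent-2.log, agent-4.log and agent-6.log files written by the real loop.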
Since most of our jobs/checks are running in a Docker container, the OS/machine can become polluted and fail to run with errors such as:
No space left on device : '/azp/agent-litGPU-21_0,1/_diag/pages/8bb191f4-a8c2-419a-8788-66e3f0522bea_1.log'
In such cases, you need to log in to the machine and run docker system prune.
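Before pruning, it can help to confirm that the disk is actually close to full. The following is only a sketch: the helper names disk_usage_percent and should_prune, the 90% threshold, and the choice of the root filesystem are all assumptions, not part of this repository.

```shell
# Print the used-space percentage (digits only) of the filesystem holding a path.
disk_usage_percent() {
  df -P "${1:-/}" | awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Succeed when usage ($1) is at or above the threshold ($2).
should_prune() {
  [ "$1" -ge "$2" ]
}

usage="$(disk_usage_percent /)"
if should_prune "$usage" 90; then   # 90 is an assumed threshold
  echo "disk at ${usage}%: consider running docker system prune"
else
  echo "disk at ${usage}%: no pruning needed"
fi
```

The sketch only reports; it deliberately does not run docker system prune itself.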
Let's explore adding a cron job for periodically removing all Docker caches:
Open your user's crontab for editing:
crontab -e
Add the following entry, using the --force flag to force pruning without interactive confirmation:
# every day at 2:00 AM clean docker caches
0 2 * * * docker system prune --force
Verify that the entry was saved:
crontab -l
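As a non-interactive alternative to editing the crontab by hand, the entry can be merged into the existing crontab text so that it appears exactly once. This is a sketch under assumptions: add_cron_entry is a hypothetical helper, and installing the result relies on the cron implementation accepting a crontab on standard input (crontab -), which common implementations do.

```shell
CRON_ENTRY='0 2 * * * docker system prune --force'

# Read an existing crontab on stdin and print it with CRON_ENTRY present once:
# drop any identical line first, then append the entry at the end.
add_cron_entry() {
  grep -vxF -e "$CRON_ENTRY" || true   # grep exits 1 when nothing is left to print
  echo "$CRON_ENTRY"
}

# Demonstration on sample text; the installed crontab is not touched here.
printf '%s\n' '# existing jobs' "$CRON_ENTRY" | add_cron_entry

# To apply it for real (assumption: crontab accepts "-" for stdin):
# crontab -l 2>/dev/null | add_cron_entry | crontab -
```

Because duplicates are filtered out before the entry is appended, running the command repeatedly should not add the job more than once.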
Note: You may need to add yourself to the Docker group by running sudo usermod -aG docker <your_username> so that you can execute this command without sudo and without entering a password.