Commit 749e2233 authored by uvos
add readmes

parent 8c4c56b5
# LLavaTagger
LLavaTagger is a Python script that tags images based on a given prompt using the [LLaVA](https://llava-vl.github.io/) multimodal LLM. LLavaTagger can use any number of GPUs in parallel via DDP for this task.
## How to use
First, create a Python venv and install the required packages into it:
$ python -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
Then run LLavaTagger, for instance like so:
$ python LLavaTagger.py --common_description "a image of a cat, " --prompt "describe the cat in 10 to 20 words" --batch 8 --quantize --image_dir ~/cat_images
By default, LLavaTagger runs in parallel on all available GPUs. If this is undesirable, use the ROCR_VISIBLE_DEVICES= or CUDA_VISIBLE_DEVICES= environment variable to hide unwanted GPUs.
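As a rough illustration, the sketch below builds an environment that exposes only selected GPUs, which could then be handed to a child process that launches LLavaTagger. The helper `env_with_visible_gpus` and its parameters are hypothetical and not part of LLavaTagger; only the two environment variable names are taken from the text above.

```python
import os


def env_with_visible_gpus(gpu_ids, vendor="cuda", base_env=None):
    """Return a copy of the environment that exposes only the given GPU indices.

    Hypothetical helper for illustration; not part of LLavaTagger.
    """
    variables = {"cuda": "CUDA_VISIBLE_DEVICES", "rocm": "ROCR_VISIBLE_DEVICES"}
    if vendor not in variables:
        raise ValueError(f"unknown vendor: {vendor!r}")
    # Copy so the caller's (or the process's) environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    # Both variables take a comma-separated list of device indices.
    env[variables[vendor]] = ",".join(str(i) for i in gpu_ids)
    return env


if __name__ == "__main__":
    # Expose only GPUs 0 and 2 to a child process.
    env = env_with_visible_gpus([0, 2])
    print(env["CUDA_VISIBLE_DEVICES"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the script from another Python program.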
LLavaTagger will then create a meta.jsonl in the image directory, suitable for use by the scripts of [diffusers](https://github.com/huggingface/diffusers) to train Stable Diffusion (XL). If other formats are desired, ../utils contains scripts to transform the metadata into other formats, for instance for use with [kohya](https://github.com/bmaltais/kohya_ss).
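To give an idea of what such a transformation can look like, here is a minimal sketch that turns a JSON Lines metadata file into one caption text file per image. It assumes each line is a JSON object with `file_name` and `text` keys, which is the layout the diffusers training examples read; the actual keys written by LLavaTagger should be checked first. The function `jsonl_to_caption_files` is hypothetical and is not one of the scripts in ../utils.

```python
import json
from pathlib import Path


def jsonl_to_caption_files(metadata_path, file_key="file_name", text_key="text"):
    """Write one .txt caption file next to each image listed in a JSON Lines file.

    Hypothetical helper; the default key names are assumptions about the layout.
    Returns the list of caption files written.
    """
    metadata_path = Path(metadata_path)
    written = []
    with metadata_path.open(encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            entry = json.loads(line)
            # Image paths are taken relative to the metadata file's directory.
            image_path = metadata_path.parent / entry[file_key]
            caption_path = image_path.with_suffix(".txt")
            caption_path.write_text(entry[text_key], encoding="utf-8")
            written.append(caption_path)
    return written
```

Calling `jsonl_to_caption_files("cat_images/meta.jsonl")` would then place, for example, `cat1.txt` next to `cat1.jpg`.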
If editing the created tags is desired, [QImageTagger](https://uvos.xyz/git/uvos/QImageTagger) can be used for this purpose.
# PersonDatasetAssembler
PersonDatasetAssembler is a Python script that finds images of a specific person, specified by a reference image, in a directory of images or in a video file. PersonDatasetAssembler also supports raw images.
## How to use
First, create a Python venv and install the required packages into it:
$ python -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
Then run PersonDatasetAssembler, for instance like so:
$ python PersonDatasetAssembler.py --referance someperson.jpg --match_model ../Weights/face_recognition_sface_2021dec.onnx --detect_model ../Weights/face_detection_yunet_2023mar.onnx --input ~/Photos --out imagesOfSomePerson
Or to extract images from a video:
$ python PersonDatasetAssembler.py --referance someperson.jpg --match_model ../Weights/face_recognition_sface_2021dec.onnx --detect_model ../Weights/face_detection_yunet_2023mar.onnx -i ~/SomeVideo.mkv --out imagesOfSomePerson
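Conceptually, matching a person against a reference image comes down to comparing a face feature vector from the reference with one from each candidate face. The sketch below shows only that comparison step, using plain Python lists in place of real features. The function names are hypothetical, the default threshold of 0.363 is the cosine threshold used in OpenCV's SFace sample code, and PersonDatasetAssembler itself may compare faces differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("feature vectors must be non-zero")
    return dot / norm


def is_same_person(reference, candidate, threshold=0.363):
    """Treat two faces as the same person if their features are similar enough.

    The threshold is an assumption and would need tuning for a real dataset.
    """
    return cosine_similarity(reference, candidate) >= threshold


if __name__ == "__main__":
    reference = [1.0, 0.0, 0.5]
    print(is_same_person(reference, [0.9, 0.1, 0.4]))
    print(is_same_person(reference, [0.0, 1.0, 0.0]))
```

A higher threshold keeps fewer but more certain matches, which trades dataset size for purity.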
# SDImagePreprocess
This repo contains a collection of high-performance tools intended to ease the creation of datasets for training image generation AI such as Stable Diffusion.
## Included tools
This repo contains the following tools:
### SmartCrop
SmartCrop is an application that uses content-aware cropping, [seam carving](https://en.wikipedia.org/wiki/Seam_carving) and resizing to bring a directory of images into the desired size and aspect ratio for training. SmartCrop is configurable to prioritize specific items or specific persons in the images provided.
#### Content detected in image:
![Content found in image](SmartCrop/images/IMGP3692.jpg)
#### Cropped image based on content:
![Cropped image](SmartCrop/images/IMGP3692C.jpg)
### PersonDatasetAssembler
PersonDatasetAssembler is a Python script that finds images of a specific person, specified by a reference image, in a directory of images or in a video file. PersonDatasetAssembler also supports raw images.
### LLavaTagger
LLavaTagger is a Python script that tags images based on a given prompt using the [LLaVA](https://llava-vl.github.io/) multimodal LLM. LLavaTagger can use any number of GPUs in parallel via DDP for this task.
### DanbooruTagger
DanbooruTagger is a Python script of dubious utility that tags images using the [DeepDanbooru](https://github.com/KichangKim/DeepDanbooru) convolutional network.
## License
All files in this repo are licensed under the GPL v3, see LICENSE.
# SmartCrop
SmartCrop is an application that uses content-aware cropping, [seam carving](https://en.wikipedia.org/wiki/Seam_carving) and resizing to bring a directory of images into the desired size and aspect ratio for training. SmartCrop is configurable to prioritize specific items or specific persons in the images provided.
## Requirements
* [cmake](https://cmake.org/) 3.6 or later
* [opencv](https://opencv.org/) 4.8 or later
* A C++17 capable compiler and standard library like gcc or llvm/clang
* git is required to get the source
## Building
The steps to build this application are:
$ git clone https://uvos.xyz/git/uvos/SDImagePreprocess.git
$ cd SDImagePreprocess
$ mkdir build
$ cd build
$ cmake ..
$ make
The binary can then be found in build/SmartCrop and can optionally be installed with:
$ sudo make install
## Basic usage
To process all images in the directory ~/images and output the results into the directory processedImages:
$ smartcrop --out processedImages ~/images/*
To also focus on the person in the image ~/person.jpg:
$ smartcrop --out processedImages --focus-person ~/person.jpg ~/images/*
To also enable seam carving:
$ smartcrop --out processedImages --focus-person ~/person.jpg --seam-carving ~/images/*
See smartcrop --help for more options.
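For readers unfamiliar with seam carving, the following is a minimal sketch of the classic technique on a grayscale image stored as a list of rows. It is written in Python for brevity, is not SmartCrop's C++ implementation, and all function names are hypothetical. It removes one vertical seam, that is, one connected top-to-bottom path of low-contrast pixels, which narrows the image by one column.

```python
def energy_map(image):
    """Gradient energy: |left - right| + |up - down|, with clamped borders."""
    height, width = len(image), len(image[0])
    energy = []
    for y in range(height):
        row = []
        for x in range(width):
            left = image[y][max(x - 1, 0)]
            right = image[y][min(x + 1, width - 1)]
            up = image[max(y - 1, 0)][x]
            down = image[min(y + 1, height - 1)][x]
            row.append(abs(left - right) + abs(up - down))
        energy.append(row)
    return energy


def find_vertical_seam(energy):
    """Return one column index per row forming the lowest-energy connected seam."""
    height, width = len(energy), len(energy[0])
    # Dynamic programming: cost[y][x] is the cheapest path from the top to (y, x).
    cost = [list(energy[0])]
    for y in range(1, height):
        prev = cost[-1]
        row = []
        for x in range(width):
            lo, hi = max(x - 1, 0), min(x + 1, width - 1)
            row.append(energy[y][x] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Backtrack from the cheapest cell in the bottom row.
    x = min(range(width), key=lambda i: cost[-1][i])
    seam = [x]
    for y in range(height - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, width - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(image):
    """Return a copy of the image that is one pixel narrower."""
    seam = find_vertical_seam(energy_map(image))
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]


if __name__ == "__main__":
    image = [[0, 0, 0, 9, 9, 9] for _ in range(3)]
    for row in remove_vertical_seam(image):
        print(row)
```

Repeating the removal step shrinks the width further while leaving high-contrast regions, such as the edge between the dark and bright halves in this example, intact for as long as possible.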
## Example
#### Content detected in image:
![Content found in image](images/IMGP3692.jpg)
#### Cropped image based on content:
![Cropped image](images/IMGP3692C.jpg)