What we have now
Rewriting the API, query processing, deployment and frontend
- Completely new frontend based on ReactJS. Now it’s possible to filter images using UI controls. Also implemented: user registration, public/private collections, image uploading
- Kubernetes deployment
- Public REST API. Now it’s possible to manage collections, upload images and execute search queries using REST endpoints
- Landing page
Getting grants from DigitalOcean and Amazon
We managed to get some credits ($13,000) from DO and Amazon. Now we can afford to run all Khumbu services on powerful servers. Bye, Hetzner!
We use DO to run the Khumbu core and AWS to process images (because DO doesn’t provide GPU instances).
Aug 2019-Sep 2019
Participation in YC Startup School
- In July all development was put on pause, and we decided to spend some time on market research and slides
- A course run by YC can’t be useless, so we decided to take part. It gave us a lot of insights, fresh ideas and meetings
- Some feedback from investors
March 2019 – June 2019
- Primitive Django based UI with basic file uploader and user registration
- Implementation of distributed image processing
- Basic search based on pyparsing and ElasticSearch. Basically, Khumbu extracts information from images and sends it to ES
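The flow in the last item can be sketched roughly as follows. This is only an illustration, not Khumbu's actual code: the document fields (`tags`, `caption`, `exif`), the `field:value` query syntax and both function names are assumptions, and a plain string split stands in for the pyparsing grammar. The only part taken from Elasticsearch itself is the shape of the `bool` / `must` / `term` / `match` query body.

```python
def build_document(image_id, tags, caption, exif):
    """Flatten information extracted from one image into a single document.

    The field names are hypothetical; the real index mapping is not described here.
    """
    return {
        "image_id": image_id,
        "tags": sorted(set(tags)),  # drop duplicate tags, keep a stable order
        "caption": caption,
        "exif": dict(exif),
    }


def to_es_query(text):
    """Translate a tiny, assumed query syntax into an Elasticsearch query body.

    'field:value' tokens become exact term filters, bare words become
    full-text matches against the caption. All tokens must match.
    """
    must = []
    for token in text.split():
        field, sep, value = token.partition(":")
        if sep and field and value:
            must.append({"term": {field: value.lower()}})
        else:
            must.append({"match": {"caption": token}})
    return {"query": {"bool": {"must": must}}}


if __name__ == "__main__":
    doc = build_document("img-1", ["dog", "beach", "dog"], "a dog on a beach", {"Model": "X100"})
    print(doc)
    print(to_es_query("tags:Dog beach"))
```

The resulting dictionaries could be passed to any Elasticsearch client as the document body and the search body respectively; a real grammar would also need quoting, negation and OR, which is where a parser library such as pyparsing helps.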
Jan 2019 – March 2019
- Image tagging and object detection (TensorFlow). Performance and accuracy analysis
- Face detection (dlib)
- Face recognition (dlib and FaceNet). Reading a lot of papers about large-scale face recognition
- Image captioning – im2txt, densecap etc
- Reading docs about EXIF
- Designing general architecture and approach to process millions of images (docker, nvidia-docker, RabbitMQ, Flask, ElasticSearch)
Initial idea – Dec 2018
- Thinking about possible use-cases, tech stack, architecture, pros and cons.
- Analysis of related products and projects
We have already spent a lot of our own funds, and it definitely will not be possible to reach the next milestone without funding. So from February we will try to reach investors and startup incubators. If you know someone who could be interested, please contact us.
We implemented face recognition (FR), but right now it’s disabled because of low performance, and it needs some changes in the architecture. We want to provide two possibilities:
- Detect faces of public persons like presidents and actors
- Allow a user to upload a photo of any person. After that, Khumbu will try to find that person in a collection of images
- Use FaceNet to extract face descriptors
- Use FLANN to build an index of descriptors. But it’s quite hard to integrate it into the current design.
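The idea behind the last two items is a nearest-neighbour search over face descriptors. A minimal sketch of that idea, under loud assumptions: it uses an exact brute-force scan instead of FLANN (which builds an approximate index to avoid scanning everything), the class and method names are hypothetical, and toy 3-dimensional vectors stand in for real FaceNet descriptors.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class BruteForceFaceIndex:
    """Exact nearest-neighbour search; a stand-in for an approximate FLANN index."""

    def __init__(self):
        self._items = []  # list of (image_id, descriptor) pairs

    def add(self, image_id, descriptor):
        self._items.append((image_id, list(descriptor)))

    def search(self, query, k=3, max_distance=None):
        """Return up to k (image_id, distance) pairs, closest first.

        max_distance is an assumed cut-off for "same person"; a real
        threshold would have to be tuned on real descriptors.
        """
        scored = sorted((l2_distance(query, d), image_id) for image_id, d in self._items)
        if max_distance is not None:
            scored = [s for s in scored if s[0] <= max_distance]
        return [(image_id, dist) for dist, image_id in scored[:k]]


if __name__ == "__main__":
    index = BruteForceFaceIndex()
    index.add("a.jpg", [0.0, 0.0, 0.0])
    index.add("b.jpg", [1.0, 0.0, 0.0])
    index.add("c.jpg", [5.0, 5.0, 5.0])
    print(index.search([0.9, 0.0, 0.0], k=2))
```

The brute-force scan costs one distance computation per stored face for every query, which is exactly the cost an index such as FLANN is meant to avoid on large collections.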
It would be nice to extract more face attributes:
We have several issues with performance:
- Image uploading is slow
- Object detection takes a significant amount of time and requires GPUs. Right now we run remote workers in AWS, but it’s very expensive. So we need to speed up this part and (probably) migrate to our own servers.
- Make the API production-ready.
- Write documentation and examples
- Add a gateway (Kong), throttling, limits etc.
- Add billing
Currently the search term is used as-is. So if you type “water”, the system will not return rivers, oceans, lakes etc. Another example: for “human” the system should return men, women, kids, toddlers etc.
- Handle synonyms
- Handle related concepts (see above)
- Add autocomplete and suggestions
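One simple way to approach the three items above is to expand the search term before it reaches the search backend. The sketch below assumes small hand-written dictionaries (`RELATED`, `SYNONYMS`) and hypothetical function names; a real system would more likely draw on a lexical database or on the label hierarchy of the detection model.

```python
# Hypothetical, hand-written mappings used only for illustration.
RELATED = {
    "water": ["river", "ocean", "lake"],
    "human": ["man", "woman", "kid", "toddler"],
}
SYNONYMS = {
    "kid": ["child"],
    "ocean": ["sea"],
}


def expand_term(term):
    """Return the term followed by its related concepts and synonyms, without duplicates."""
    term = term.strip().lower()
    expanded = [term]
    for candidate in RELATED.get(term, []) + SYNONYMS.get(term, []):
        if candidate not in expanded:
            expanded.append(candidate)
    return expanded


def suggest(prefix, vocabulary, limit=5):
    """Naive autocomplete: known words that start with the typed prefix."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    return sorted(w for w in vocabulary if w.startswith(prefix))[:limit]


if __name__ == "__main__":
    print(expand_term("Water"))
    print(suggest("wa", ["water", "wave", "lake"]))
```

The expanded list could then be sent to the search backend as alternatives (any of the terms may match) rather than as a single literal term.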
Improve object detection
- Improve performance
- Improve detection accuracy
- Improve detection of small objects
It would be cool to allow users to upload their own models. Then someone could train a model to recognize cat breeds or car models, upload it to Khumbu and search for images with specific cats or cars.