Strategy please

My project’s sole job is to extract questions from MCQ PDFs. But usually the answers aren’t alongside the questions; they’re written as a combined answer key at the end. I’m using Groq’s vision-based Llama-series models, and since Groq doesn’t support PDF input for them, the PDFs are converted to images and then sent to the models. But the models can’t match answers to questions because the key is at the end, and I need each answer alongside its question in JSON format. If the models had context caching across all my images, I could have just asked with one API request to get all the answers at once. BUT I CAN’T.

The way I’d do it is build a data model for all the questions (I’m guessing you’re working with multiple-choice question PDFs): a JSON representation of every question, its choices, etc. This becomes your questions list.
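To make that concrete, here’s a minimal sketch of such a data model in Python. The field names (`number`, `text`, `choices`, `answer`) and the sample question are my own assumptions, not something from your PDFs; the point is just that `answer` starts out empty and gets filled in later from the key.

```python
from dataclasses import dataclass, asdict
from typing import Dict, Optional
import json

@dataclass
class Question:
    number: int                   # question number as printed in the PDF
    text: str                     # the question stem
    choices: Dict[str, str]       # option letter -> option text
    answer: Optional[str] = None  # filled in later from the answer key

# Hypothetical sample; in practice this list comes from the vision
# model's transcription of the question pages.
questions = [
    Question(
        number=1,
        text="What is the capital of France?",
        choices={"A": "Berlin", "B": "Paris", "C": "Rome", "D": "Madrid"},
    ),
]

# Serialize the whole list to JSON once answers are merged in
print(json.dumps([asdict(q) for q in questions], indent=2))
```

Keeping `answer` as an optional field means you can transcribe the question pages and the answer-key pages in separate API calls and join them afterwards.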

Then when you’re transcribing the answers, set it up so you can map each answer back to its question; if the key is a scantron-style grid or similar, you could use the vertical/horizontal positions to map back to the answer choice.
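Assuming the key is transcribed as plain text rather than a grid, the mapping step can be as simple as a regex merge. The `"N. (X)"` key format below is a guess at what the model might emit; adjust the pattern to whatever your key actually looks like.

```python
import re

# Hypothetical answer-key text as transcribed by the vision model
answer_key_text = "1. (B)   2. (D)   3. (A)"

# Questions already extracted from the earlier pages (answers unknown)
questions = [
    {"number": 1, "text": "...", "answer": None},
    {"number": 2, "text": "...", "answer": None},
    {"number": 3, "text": "...", "answer": None},
]

# Build {question_number: answer_letter} from the key text
key = {int(n): letter
       for n, letter in re.findall(r"(\d+)\.\s*\(([A-E])\)", answer_key_text)}

# Merge answers back into the question records by question number
for q in questions:
    q["answer"] = key.get(q["number"])
```

Using `key.get(...)` rather than indexing means a question missing from the key just keeps `answer: None` instead of crashing, which is handy when the model mis-transcribes a number.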

It’s really hard to help you without seeing what the PDFs look like

Thanks for the response. The PDFs look like the one attached: “https://drive.google.com/file/d/1qTmJEe2ABx8P6PU6cxNO1wYibZdTLYxQ/view?usp=sharing”

Hi! I’m taking a look right now, and the answer key seems odd: e.g. Question 6 is multiple choice but has an answer of “(3)”, so I think some of the answers are incorrect?

Anyway, I’ve been taking a crack at a small implementation and will share it shortly; it’s not perfect, but I think the strategy is decent.