An EdTech test-preparation company wanted to give its students topically targeted tests. It had a large multiple-choice question (MCQ) bank of several lakh (hundreds of thousands of) questions, but the questions were not tagged by topic. The challenge was that the question text contained many mathematical and chemical symbols, structures, equations, and diagrams, which made this a complex classification problem over both text and images. Radix developed a solution pipeline that achieved 92% accuracy (measured as average recall) across about 30 topics in each of the four subjects.
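To make the "average recall" figure concrete, here is a minimal sketch of one common way to compute it: the unweighted mean of per-topic recall (often called macro-averaged recall). This is an assumption, since the case study does not say how the average was taken. The function name `average_recall`, the topic names, and the labels are hypothetical and purely illustrative; they are not data from the project.

```python
from collections import defaultdict


def average_recall(true_topics, predicted_topics):
    """Return (macro-averaged recall, per-topic recall).

    Assumption: "average recall" means the unweighted mean of the
    recall obtained on each topic, so every topic counts equally
    regardless of how many questions it has.
    """
    correct = defaultdict(int)  # correctly tagged questions per true topic
    total = defaultdict(int)    # all questions per true topic
    for truth, predicted in zip(true_topics, predicted_topics):
        total[truth] += 1
        if truth == predicted:
            correct[truth] += 1
    per_topic = {topic: correct[topic] / total[topic] for topic in total}
    macro = sum(per_topic.values()) / len(per_topic)
    return macro, per_topic


# Hypothetical toy labels, not taken from the actual question bank.
true_topics = ["algebra", "algebra", "optics", "optics", "optics", "kinetics"]
predicted_topics = ["algebra", "optics", "optics", "optics", "algebra", "kinetics"]

macro, per_topic = average_recall(true_topics, predicted_topics)
print(f"average recall: {macro:.3f}")
for topic, recall in sorted(per_topic.items()):
    print(f"  {topic}: {recall:.3f}")
```

Averaging per topic rather than per question keeps a few very large topics from dominating the score, which matters when a question bank is unevenly spread over roughly 30 topics per subject.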