How to split text into paragraph from pdf document and extract their information?
I want to find a model of law splitting, which are pdf files. The law must split into Section, Part, Chapter, Article, Paragraph. It does not have to contain all of them. For example, one law may contain only Section and Part, while another may contain all of them. Also, after splitting, the information that the Section, Part, Chapter, Article and Paragraph may contain must be kept. All information should be displayed in separate columns in a table with as few errors as possible. The photo below shows all the possible ways in which a Greek law can be broken. Thanks in advance!