Based on section 12.5 of the book.
Goal: Create an index for a document: makeIndex :: Doc -> Index
Question: What types should Doc and Index have?
It makes sense to decompose makeIndex into a “pipeline” of steps:
makeIndex :: Doc -> Index
makeIndex
  = lines
  >.> numberLines
  >.> allNumberedWords
  >.> sortWords
  >.> intsToLists
  >.> groupByWord
  >.> eliminateSmallWords
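The >.> operator is not part of the Prelude, so it has to be defined before the pipeline will compile. A minimal sketch, assuming it is meant as forward composition (apply the left function first, then the right one); the fixity chosen here and the helper countWords are illustrative assumptions:

```haskell
infixl 9 >.>

-- Forward composition: run g first, then feed its result to f.
(>.>) :: (a -> b) -> (b -> c) -> (a -> c)
g >.> f = f . g

-- Hypothetical example: split a string into words, then count them.
countWords :: String -> Int
countWords = words >.> length
```

Under that reading, >.> behaves like the >>> operator from Control.Arrow when used on plain functions.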
lines takes the document and splits it into a list of lines.
numberLines takes the list of lines and adds line numbers to them, forming pairs.
allNumberedWords replaces each numbered line with a list of number-word pairs.
sortWords reorders the list of number-word pairs by word.
intsToLists turns each integer into a one-element list.
groupByWord puts together the lists corresponding to the same word.
eliminateSmallWords eliminates all words of length at most 4.
What should the types of each of these intermediate functions be?
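One possible answer to the types question, sketched as type synonyms. The names NumberedLine, NumberedWord and Entry are hypothetical, and treating Doc as a plain String is an assumption; other choices would work too. The stage signatures are given as comments because their definitions are the subject of the steps that follow.

```haskell
type Doc          = String          -- the whole document as one string (assumption)
type NumberedLine = (Int, String)   -- a line paired with its line number
type NumberedWord = (Int, String)   -- a word paired with the line it occurs on
type Entry        = ([Int], String) -- a word with all of its line numbers
type Index        = [Entry]

-- With these synonyms the stages would have the types:
--   lines               :: Doc            -> [String]
--   numberLines         :: [String]       -> [NumberedLine]
--   allNumberedWords    :: [NumberedLine] -> [NumberedWord]
--   sortWords           :: [NumberedWord] -> [NumberedWord]
--   intsToLists         :: [NumberedWord] -> [Entry]
--   groupByWord         :: [Entry]        -> [Entry]
--   eliminateSmallWords :: [Entry]        -> Index

-- A small hand-written value of the final type, for illustration only.
sampleIndex :: Index
sampleIndex = [([1, 3], "document"), ([2], "index")]
```

Each stage's output type matches the next stage's input type, which is what lets the stages be chained with >.>.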
lines is a built-in function from the Prelude. What is its type? Does that match our usage of it?
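A quick check of how lines behaves, assuming the document is represented as a String. The names sampleDoc and sampleLines are made up for this example.

```haskell
-- In the Prelude: lines :: String -> [String]

-- A hypothetical two-line document.
sampleDoc :: String
sampleDoc = "the cat sat\non the mat\n"

-- Splitting at newline characters; the trailing newline does not
-- produce an extra empty line.
sampleLines :: [String]
sampleLines = lines sampleDoc
-- ["the cat sat","on the mat"]
```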
numberLines is supposed to replace each line with the pair of an increasing number and the line. How can we implement that using list functions?
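One way to do this with list functions is zip against an infinite list of numbers. A minimal sketch, assuming line numbers start at 1 and lines are plain strings:

```haskell
-- Pair each line with its 1-based line number.
-- zip stops at the end of the shorter list, so the infinite
-- list [1 ..] is cut off after the last line.
numberLines :: [String] -> [(Int, String)]
numberLines = zip [1 ..]
```

For example, numberLines ["the cat", "sat"] gives [(1,"the cat"),(2,"sat")].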
allNumberedWords is supposed to take each line and split it into a list of words, then put those words together with the line’s number. We can split this into steps:
Write each step.
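A possible decomposition, offered as a sketch rather than the intended answer: handle a single numbered line first, then apply that to every line and flatten the results. The helper name numberedWords is hypothetical, and using the Prelude's words to split a line is an assumption (it splits on whitespace only, so punctuation stays attached to words).

```haskell
-- Step 1: split one numbered line into words and tag each word
-- with that line's number.
numberedWords :: (Int, String) -> [(Int, String)]
numberedWords (n, line) = map (\w -> (n, w)) (words line)

-- Step 2: do this for every line and concatenate the resulting lists.
allNumberedWords :: [(Int, String)] -> [(Int, String)]
allNumberedWords = concatMap numberedWords
```

For example, allNumberedWords [(1, "the cat"), (2, "sat")] gives [(1,"the"),(1,"cat"),(2,"sat")].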
sortWords sorts the list based on the word comparison.
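A sketch using sortBy from Data.List with a custom comparison. Comparing by word first and by line number second is an assumption; it keeps the line numbers for each word in ascending order.

```haskell
import Data.List (sortBy)

-- Order the pairs by word; ties are broken by line number.
sortWords :: [(Int, String)] -> [(Int, String)]
sortWords = sortBy byWord
  where
    byWord (n1, w1) (n2, w2) = compare (w1, n1) (w2, n2)
```

For example, sortWords [(1, "cat"), (1, "ant"), (2, "cat"), (2, "ant")] gives [(1,"ant"),(2,"ant"),(1,"cat"),(2,"cat")].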
intsToLists turns each integer into a 1-element list. You can do this via a map.
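The map version might look like this, assuming the input is a list of (line number, word) pairs:

```haskell
-- Wrap each line number in a one-element list, leaving the word unchanged.
intsToLists :: [(Int, String)] -> [([Int], String)]
intsToLists = map (\(n, w) -> ([n], w))
```

For example, intsToLists [(1, "ant"), (2, "ant")] gives [([1],"ant"),([2],"ant")].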
groupByWord needs to put together the lists corresponding to the same word. You need to write an explicit recursive function for this one, with separate cases for whether two consecutive elements have the same word.
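One way to write the recursion, as a sketch. It assumes the input is already sorted by word, so that entries for the same word are adjacent.

```haskell
-- Merge neighbouring entries that carry the same word.
groupByWord :: [([Int], String)] -> [([Int], String)]
groupByWord ((ns1, w1) : (ns2, w2) : rest)
  -- Same word: combine the two number lists and keep going, since
  -- the merged entry may also match the entry after it.
  | w1 == w2  = groupByWord ((ns1 ++ ns2, w1) : rest)
  -- Different words: the first entry is complete.
  | otherwise = (ns1, w1) : groupByWord ((ns2, w2) : rest)
-- Zero or one entries: nothing left to merge.
groupByWord entries = entries
```

For example, groupByWord [([1], "ant"), ([2], "ant"), ([1], "cat")] gives [([1,2],"ant"),([1],"cat")]. This version does not remove duplicates, so a word that occurs twice on one line gets that line number twice.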
eliminateSmallWords is a simple filter.
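A sketch of the filter, reading "eliminates all words of length at most 4" as "keep only words with more than 4 characters":

```haskell
-- Keep only the entries whose word is longer than 4 characters.
eliminateSmallWords :: [([Int], String)] -> [([Int], String)]
eliminateSmallWords = filter (\(_, w) -> length w > 4)
```

For example, eliminateSmallWords [([1], "this"), ([2], "index")] gives [([2],"index")].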