Monday, January 28, 2019

Integration of Configuration class and FileBot class

Now that the Configuration class is ready, it is time to integrate it into my FileBot class. For now, only loadRemoteFS() requires data from the INDEX file, so the integration should be fairly easy. Until now, I have been passing in a vector to temporarily mimic the data available from the INDEX file. This is what I did in my unit test (testSearch.cpp):
/*****   FileBot.cpp   *****/

std::list<string> FileBot::loadRemoteFS(vector<string> fileList, bool showCaptureList) {
   
   for( vector<string>::iterator it = fileList.begin(); it != fileList.end(); it++ ) {
   }

   ...
   ...
}


/*****   testSearch.cpp   *****/

BOOST_AUTO_TEST_CASE(TL_5, *boost::unit_test::precondition(skipTest(false)))
{
    ...
    ...

    vector<string> keyList;
    keyList.push_back("/home/puiyee/workspaceqt/debug/FolderA");
    keyList.push_back("/home/puiyee/workspaceqt/debug/FolderA/subA/subB/file_3.txt");
    vector<string> found;

    fb.loadRemoteFS(keyList);

    ...
}
Now I have replaced this chunk of code with a more elegant piece. During the test, I create a real XML file in createDestinationConfig() to mimic the INDEX file. loadRemoteFS() no longer takes a std::vector as an input parameter: it digests the INDEX file, knows what data to look for, swallows it, and then produces the output.
/*****   FileBot.cpp   *****/

FileNode* FileBot::loadRemoteFS()
{
   ...

   pugi::xpath_node_set fileList = Configuration::getInstance()->retrieveRemoteFiles("/backup/file");
   pugi::xpath_node_set::const_iterator it = fileList.begin();
   for( ; it != fileList.end(); it++ ) {
   }

   ...
   ...
}


/*****   testSearch.cpp   *****/

BOOST_AUTO_TEST_CASE(TL_1, *boost::unit_test::precondition(skipTest(false)))
{
   ...
   ...

   std::vector<string> keyList;
   keyList.push_back("/home/puiyee/workspaceqt/debug/FolderA/subA/subB/file_3.txt");

   FileBotUnderTest fb;
   fb.createDestinationConfig("/home/puiyee/workspaceqt/debug/FolderA", keyList);
   fb.loadRemoteFS();

   ...
}
Don't be confused: the keyList in the test case above is only required by createDestinationConfig(). The first item of the old keyList, which indicated the root of the remote path, is no longer required either, since the root is now passed to createDestinationConfig() as a separate argument. Because loadRemoteFS() produces output, I need to verify that output to ensure consistency. I use this code at the end of the unit test.
BOOST_AUTO_TEST_CASE(TL_1, *boost::unit_test::precondition(skipTest(true)))
{
    ...
    ...

    pugi::xml_document doc;
    pugi::xml_parse_result result = doc.load_file("backup.xml");
    if( result ) {
        pugi::xpath_node_set files = doc.select_nodes("/backup/file");
        BOOST_TEST(files.size() == 0);

        files = doc.select_nodes("/backup/recover/dest");
        for( pugi::xpath_node_set::const_iterator it = files.begin(); it != files.end(); it++ ) {
            pugi::xml_node file = ((pugi::xpath_node)*it).node();
            string value = file.text().get();

            BOOST_TEST(value.compare("/home/puiyee/workspaceqt/debug/FolderA/sub/file_2.txt") == 0);
        }

        files = doc.select_nodes("/backup/recover/src");
        BOOST_TEST(files.size() == 0);
    }
}
In this unit test, I load backup.xml and then verify that /backup/recover/dest has been created in the following format.
 <backup>
    <recover>
       <dest></dest>
    </recover>
 </backup>
The same goes for /backup/recover/src.

Friday, January 25, 2019

Introducing new member - Configuration

Time flies: it has been almost 2 months since my last update. I have been working on a new class to handle the INDEX file. The class is named Configuration, and its sole responsibility is to work with the INDEX file. The INDEX file consists of XML tags containing information about the file structure being scanned.

During those 2 months, I struggled with handling the XML file in Boost. I had completed roughly 60% of the work before I found out that it is not easy to remove an XML tag I no longer need. After a lot of trial and error I still could not get it to work, so I started looking for alternatives and found that pugixml can remove and update nodes easily. With that, I began switching my code to pugixml. I was lucky that only around 30% of the code needed rework.

When I first designed this class, I tried to think of a lazy way to accomplish the task. I wanted to avoid having to call initialize() right after the object is created, because that doesn't look smart, so I do the setup in the constructor instead. It is an old method, but effective. When an instance is first created, it looks for the configuration settings for the source and remote paths. Every time the class is loaded into memory, it looks for the remote path first. If that is empty, the given path is stored as the remote (destination) path; otherwise it is treated as the source path.
Configuration::Configuration()
{
    defaultPath = filesystem::current_path().string() + GENERIC_PATH_SEPARATOR + INDEX_FILENAME;

    // create a new one if the index file doesn't exist
    if( !boost::filesystem::exists(INDEX_FILENAME) ) {
        auto declareNode = doc.append_child(pugi::node_declaration);
        declareNode.append_attribute("version") = "1.0";
        declareNode.append_attribute("encoding") = "UTF-8";

        pathLookup(filesystem::current_path().string());
    }
    // load the index file if it exists
    else {
        readConfigFile();

        destPath = doc.child("backup").child("configuration").child("destination_path").child_value();
        sourcePath = doc.child("backup").child("configuration").child("source_path").child_value();
    }
}

string Configuration::pathLookup(string inputPath)
{
    // assign the new destination path if neither field has been set yet
    if( destPath.size() == 0 && sourcePath.size() == 0 ) {
        destPath = inputPath;

        pugi::xml_node destPathNode = doc.append_child("backup").append_child("configuration").append_child("destination_path");
        destPathNode.append_child(pugi::node_pcdata).set_value(inputPath.c_str());

        updateConfiguration();

        return destPath;
    }
    // assign the new source path if only the destination path has been set
    else if( sourcePath.size() == 0 && destPath.size() != 0 ) {
        sourcePath = inputPath;

        pugi::xml_node configNode = doc.child("backup").child("configuration");
        configNode.append_child("source_path").append_child(pugi::node_pcdata).set_value(inputPath.c_str());

        updateConfiguration();

        return sourcePath;
    }
    else
        return "";
}
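To make the branching in pathLookup() easier to see, here is a minimal sketch of the same rule without pugixml. PathSlots and assign() are hypothetical names used only for illustration; they are not part of the real Configuration class, and the sketch skips writing anything to the INDEX file.

```cpp
#include <string>

// Hypothetical stand-in for the two path fields kept by Configuration.
struct PathSlots {
    std::string destPath;
    std::string sourcePath;

    // Same branching as pathLookup(): the first path fills the destination
    // slot, the second fills the source slot, and any later call changes
    // nothing and returns an empty string.
    std::string assign(const std::string& inputPath) {
        if (destPath.empty() && sourcePath.empty()) {
            destPath = inputPath;
            return destPath;
        }
        if (sourcePath.empty() && !destPath.empty()) {
            sourcePath = inputPath;
            return sourcePath;
        }
        return "";
    }
};
```

Calling assign("/remote") and then assign("/local") on a fresh PathSlots fills destPath and then sourcePath; a third call returns an empty string and leaves both fields untouched.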
Next is the content construction. This class must be able to construct XML tags from a given input. For example, if I pass in a key such as recover.source, it must construct the structure shown below:
<recover>
   <source></source>
</recover>
And not something like this:
<recover.source></recover.source>
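The mapping from a dotted key to nested tags can be sketched with the standard library alone. This is only an illustration of the idea, not the real implementation: splitKey() and nestedTags() are hypothetical helpers, and the sketch builds a string instead of touching a pugixml document. Empty segments are skipped, which, as far as I know, is also what boost::char_separator does by default.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split "recover.source" into {"recover", "source"}; empty segments are skipped.
std::vector<std::string> splitKey(const std::string& key) {
    std::vector<std::string> parts;
    std::istringstream in(key);
    std::string part;
    while (std::getline(in, part, '.')) {
        if (!part.empty())
            parts.push_back(part);
    }
    return parts;
}

// Render the segments as nested, empty XML elements on a single line.
std::string nestedTags(const std::string& key) {
    const std::vector<std::string> parts = splitKey(key);
    std::string xml;
    // opening tags from the outermost to the innermost segment
    for (const std::string& p : parts)
        xml += "<" + p + ">";
    // closing tags in reverse order
    for (auto it = parts.rbegin(); it != parts.rend(); ++it)
        xml += "</" + *it + ">";
    return xml;
}
```

With this sketch, nestedTags("recover.source") gives `<recover><source></source></recover>` rather than a single `<recover.source>` element.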
Well, that's the design, but the code behind this logic isn't straightforward. One condition is to check whether duplicates are allowed; another is to check whether the XML tag exists and, if it doesn't, create it.
pugi::xml_node Configuration::allocateNode(string key, bool duplicateKey)
{
    char_separator<char> sep(".");
    tokenizer<char_separator<char> > token(key, sep);
    pugi::xml_node node;

    BOOST_FOREACH( const string& nodeName, token ) {
        qDebug() << "processing node name: " << nodeName.c_str();

        // retrieve the root node for the first time
        if( node.empty() ) {
            // create a new node if the root node was not found
            if( doc.child(nodeName.c_str()).empty() )
                node = doc.append_child(nodeName.c_str());
            // retrieve the root node
            else
                node = doc.child(nodeName.c_str());
        }
        else {
            // test if the child node is there
            if( node.child(nodeName.c_str()).empty() )
                node = node.append_child(nodeName.c_str());
            // retrieve the node
            else
                node = node.child(nodeName.c_str());
        }
    }

    return node;
}


void Configuration::writeValue(string key, bool duplicateKey, string value)
{
    bool allowInsert = true;

    string xpath = key;
    // convert key to XPath
    std::replace(xpath.begin(), xpath.end(), '.', '/');
    xpath = "/" + xpath;
    qDebug() << "XML node path: " << xpath.c_str();

    // duplicate value is not allowed
    if( !duplicateKey ) {
        // overwrite the value without validation
        pugi::xpath_node node = doc.select_node(xpath.c_str());
        if( !node.node().empty() ) {
            qDebug() << node.node().name() << " : " << node.node().text().get();

            node.node().text().set(value.c_str());
        }
        else {
            pugi::xml_node tmp = allocateNode(key, duplicateKey);
            tmp.text().set(value.c_str());
        }
    }
    else {
        // walk through each node to check for a duplicate value
        pugi::xpath_node_set files = doc.select_nodes(xpath.c_str());
        for( pugi::xpath_node_set::const_iterator it = files.begin(); it != files.end(); ++it ) {
            pugi::xpath_node file = *it;
            string val = file.node().text().get();

            qDebug() << file.node().name() << " : " << file.node().text().get();

            if( value.compare(val) == 0 )
                allowInsert = false;
        }

        // no duplicate value, proceed to insert the value
        if( allowInsert ) {
            // bail out if the key has no parent (it contains no '.')
            if( key.find_last_of(".") == string::npos )
                return;

            string parentNode;
            parentNode = key.substr(0, key.find_last_of("."));

            if( parentNode.compare(key) == 0 )
                return;

            // is the XML node missing? Make a new one if it went missing
            pugi::xml_node node = allocateNode(parentNode, duplicateKey);

            string nodeName = key.substr(key.find_last_of(".") + 1, key.length());
            node = node.append_child(nodeName.c_str());
            node.append_child(pugi::node_pcdata).set_value(value.c_str());
        }
    }

    updateConfiguration();
}
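The two string manipulations that writeValue() relies on, turning a dotted key into an XPath and splitting a key into its parent and last segment, can be tried out in isolation. The sketch below is only an illustration: toXPath() and splitParentLeaf() are hypothetical helper names that do not exist in the real class.

```cpp
#include <algorithm>
#include <string>
#include <utility>

// "recover.dest" -> "/recover/dest"
std::string toXPath(std::string key) {
    std::replace(key.begin(), key.end(), '.', '/');
    return "/" + key;
}

// Split at the last dot: "backup.recover.dest" -> {"backup.recover", "dest"}.
// A key without a dot has no parent, so the first element is empty.
std::pair<std::string, std::string> splitParentLeaf(const std::string& key) {
    const std::string::size_type dot = key.find_last_of('.');
    if (dot == std::string::npos)
        return std::make_pair(std::string(), key);
    return std::make_pair(key.substr(0, dot), key.substr(dot + 1));
}
```

Comparing the result of find_last_of() against std::string::npos rather than -1 avoids a signed/unsigned comparison and states the intent directly.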
Last but not least, this class can also remove an XML tag. nodePath tells which XML tag to look at, and only the first tag on that path whose value equals nodeValue is removed.
void Configuration::removeNode(string nodePath, string nodeValue)
{
    // convert key to XPath
    std::replace(nodePath.begin(), nodePath.end(), '.', '/');
    nodePath = "/" + nodePath;
    qDebug() << "XML node path: " << nodePath.c_str();

    pugi::xml_node node;
    pugi::xpath_node_set nodes = doc.select_nodes(nodePath.c_str());
    for( pugi::xpath_node_set::const_iterator it = nodes.begin(); it != nodes.end(); ++it ) {
        pugi::xpath_node file = *it;
        string val = file.node().text().get();

        if( nodeValue.compare(val) == 0 ) {
            node = file.node();
            break;
        }
    }

    // nothing matched, so there is nothing to remove or save
    if( node.empty() )
        return;

    node.parent().remove_child(node);
    updateConfiguration();
}
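The selection rule in removeNode(), "remove the first node on a path whose text equals a given value", can be sketched on a plain vector standing in for the XML document. Entry and removeFirstMatch() are hypothetical names made up for this illustration; the real code works on pugixml nodes instead.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical flat stand-in for an XML node: its path and its text value.
struct Entry {
    std::string path;
    std::string text;
};

// Remove the first entry whose path and text both match.
// Returns true if something was removed, false if nothing matched.
bool removeFirstMatch(std::vector<Entry>& entries,
                      const std::string& path,
                      const std::string& value) {
    auto it = std::find_if(entries.begin(), entries.end(),
        [&](const Entry& e) { return e.path == path && e.text == value; });
    if (it == entries.end())
        return false;
    entries.erase(it);
    return true;
}
```

Returning a bool lets the caller decide whether the file needs to be saved again, which is one way to avoid rewriting the INDEX file when nothing matched.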