Song text Sentiment Analysis

User: "mv070"
New Altair Community Member
Updated by Jocelyn
Hello,

I would like to do a sentiment analysis of songs. I have 250 songs, but I am not sure how to proceed. I use an XPath to extract the songs, and after some preprocessing each song is split into individual words. Each word is therefore a separate attribute, and not every song has every word attribute. How do I continue? I'm pretty stuck at this stage.
I'm also thinking about keeping the whole text of each song as one attribute instead of many.
    User: "SGolbert"
    New Altair Community Member
    Accepted Answer
    Updated by SGolbert
    Hi @mv070,

    The best you can do is import all songs as text and then use the Text Processing extension. In principle, having one attribute per word is fine. You can use stemming to reduce the number of attributes a bit and filter out some of the words (both of which the Text Processing extension supports).
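
A minimal sketch of the one-attribute-per-word idea outside RapidMiner, in plain Python. The stop word list, the two example lyrics, and the helper names `tokenize` and `word_vectors` are assumptions made for illustration, and stemming is left out. It shows that a word missing from a song simply becomes a count of 0, so every song ends up with the same attributes.

```python
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption, not a real lexicon).
STOPWORDS = {"the", "a", "and", "is", "in", "of", "to"}

def tokenize(text):
    """Lowercase the text and split on anything that is not a letter."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

def word_vectors(songs):
    """Return the shared vocabulary and one count row per song."""
    counts = [Counter(t for t in tokenize(s) if t not in STOPWORDS)
              for s in songs]
    vocab = sorted(set().union(*counts))
    # A Counter returns 0 for absent words, which fills the "missing" attributes.
    rows = [[c[w] for w in vocab] for c in counts]
    return vocab, rows

# Hypothetical lyrics, used only to demonstrate the table layout.
songs = ["The sun is shining", "Rain and the sun"]
vocab, rows = word_vectors(songs)
```

Here `vocab` holds the attribute names and each entry of `rows` is one song, so the table has the same width for every song regardless of which words it contains.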

    I guess you don't have labels in the data. Then you have the option of using the Dictionary Based Sentiment operator. You will need to generate/download a dictionary for that.

    Edit: I developed a sample process
    <?xml version="1.0" encoding="UTF-8"?><process version="9.1.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="9.1.000" expanded="true" name="Process" origin="GENERATED_TUTORIAL">
        <parameter key="logverbosity" value="init"/>
        <parameter key="random_seed" value="2001"/>
        <parameter key="send_mail" value="never"/>
        <parameter key="notification_email" value=""/>
        <parameter key="process_duration_for_mail" value="30"/>
        <parameter key="encoding" value="SYSTEM"/>
        <process expanded="true">
          <operator activated="true" class="subprocess" compatibility="9.1.000" expanded="true" height="82" name="Subprocess" origin="GENERATED_TUTORIAL" width="90" x="112" y="340">
            <process expanded="true">
              <operator activated="true" class="generate_data_user_specification" compatibility="9.1.000" expanded="true" height="68" name="Generate Data by User Specification" origin="GENERATED_TUTORIAL" width="90" x="45" y="34">
                <list key="attribute_values">
                  <parameter key="Key" value="&quot;good&quot;"/>
                  <parameter key="Value" value="1"/>
                </list>
                <list key="set_additional_roles"/>
              </operator>
              <operator activated="true" class="generate_data_user_specification" compatibility="9.1.000" expanded="true" height="68" name="Generate Data by User Specification (2)" origin="GENERATED_TUTORIAL" width="90" x="45" y="136">
                <list key="attribute_values">
                  <parameter key="Key" value="&quot;bad&quot;"/>
                  <parameter key="Value" value="-1"/>
                </list>
                <list key="set_additional_roles"/>
              </operator>
              <operator activated="true" class="append" compatibility="9.1.000" expanded="true" height="103" name="Append" origin="GENERATED_TUTORIAL" width="90" x="179" y="85">
                <parameter key="datamanagement" value="double_array"/>
                <parameter key="data_management" value="auto"/>
                <parameter key="merge_type" value="all"/>
              </operator>
              <connect from_op="Generate Data by User Specification" from_port="output" to_op="Append" to_port="example set 1"/>
              <connect from_op="Generate Data by User Specification (2)" from_port="output" to_op="Append" to_port="example set 2"/>
              <connect from_op="Append" from_port="merged set" to_port="out 1"/>
              <portSpacing port="source_in 1" spacing="0"/>
              <portSpacing port="sink_out 1" spacing="0"/>
              <portSpacing port="sink_out 2" spacing="0"/>
            </process>
            <description align="center" color="transparent" colored="false" width="126">Generate dummy dictionary</description>
          </operator>
          <operator activated="true" class="operator_toolbox:dictionary_sentiment_learner" compatibility="1.7.000" expanded="true" height="82" name="Dictionary Based Sentiment" origin="GENERATED_TUTORIAL" width="90" x="380" y="340">
            <parameter key="value_attribute" value="Value"/>
            <parameter key="key_attribute" value="Key"/>
            <parameter key="negation_attribute" value=""/>
            <parameter key="negation_window_size" value="1"/>
            <parameter key="use_symmetric_negation_window" value="false"/>
          </operator>
          <operator activated="true" class="text:create_document" compatibility="8.1.000" expanded="true" height="68" name="Create Document" width="90" x="112" y="34">
            <parameter key="text" value="the good, the bad and the ugly is a good film"/>
            <parameter key="add label" value="false"/>
            <parameter key="label_type" value="nominal"/>
          </operator>
          <operator activated="true" class="text:tokenize" compatibility="8.1.000" expanded="true" height="68" name="Tokenize (2)" width="90" x="246" y="34">
            <parameter key="mode" value="non letters"/>
            <parameter key="characters" value=".:"/>
            <parameter key="language" value="English"/>
            <parameter key="max_token_length" value="3"/>
          </operator>
          <operator activated="true" class="text:create_document" compatibility="8.1.000" expanded="true" height="68" name="Create Document (2)" width="90" x="112" y="136">
            <parameter key="text" value="the good, the bad and the ugly is a bad bad film"/>
            <parameter key="add label" value="false"/>
            <parameter key="label_type" value="nominal"/>
          </operator>
          <operator activated="true" class="text:tokenize" compatibility="8.1.000" expanded="true" height="68" name="Tokenize (3)" width="90" x="246" y="136">
            <parameter key="mode" value="non letters"/>
            <parameter key="characters" value=".:"/>
            <parameter key="language" value="English"/>
            <parameter key="max_token_length" value="3"/>
          </operator>
          <operator activated="true" class="collect" compatibility="9.1.000" expanded="true" height="103" name="Collect" width="90" x="447" y="34">
            <parameter key="unfold" value="false"/>
          </operator>
          <operator activated="true" class="operator_toolbox:apply_model_documents" compatibility="1.7.000" expanded="true" height="103" name="Apply Model (Documents)" width="90" x="581" y="187">
            <list key="application_parameters"/>
          </operator>
          <connect from_op="Subprocess" from_port="out 1" to_op="Dictionary Based Sentiment" to_port="exa"/>
          <connect from_op="Dictionary Based Sentiment" from_port="mod" to_op="Apply Model (Documents)" to_port="mod"/>
          <connect from_op="Create Document" from_port="output" to_op="Tokenize (2)" to_port="document"/>
          <connect from_op="Tokenize (2)" from_port="document" to_op="Collect" to_port="input 1"/>
          <connect from_op="Create Document (2)" from_port="output" to_op="Tokenize (3)" to_port="document"/>
          <connect from_op="Tokenize (3)" from_port="document" to_op="Collect" to_port="input 2"/>
          <connect from_op="Collect" from_port="collection" to_op="Apply Model (Documents)" to_port="doc"/>
          <connect from_op="Apply Model (Documents)" from_port="exa" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
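
For readers who want to see the dictionary idea in isolation, here is a rough plain-Python sketch. It is not how the Dictionary Based Sentiment operator is implemented; it only assumes the simplest possible scoring rule (sum the weights of the words found in the dictionary) and reuses the two-word toy dictionary and the two example sentences from the process above. The names `SENTIMENT`, `tokenize`, and `score` are made up for this example, and negation handling is ignored.

```python
import re

# Toy dictionary mirroring the sample process: word -> sentiment weight.
SENTIMENT = {"good": 1.0, "bad": -1.0}

def tokenize(text):
    """Lowercase the text and split on anything that is not a letter."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

def score(text, dictionary=SENTIMENT):
    """Sum the weights of all tokens that appear in the dictionary."""
    return sum(dictionary.get(tok, 0.0) for tok in tokenize(text))

documents = [
    "the good, the bad and the ugly is a good film",
    "the good, the bad and the ugly is a bad bad film",
]
scores = [score(d) for d in documents]
```

Under this rule the first sentence scores 1.0 (two "good", one "bad") and the second scores -2.0 (one "good", three "bad"). For real lyrics the two-word dictionary would be replaced by a downloaded sentiment lexicon.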

    I'm also tagging @mschmitz because the tutorial process for the Apply Model (Documents) operator seems to be broken.

    I hope it helps!
    Regards,
    Sebastian



    User: "Telcontar120"
    New Altair Community Member
    Accepted Answer
    You can also build a model directly to predict sentiment, but you would first need to hand-label some songs, assign that as your label, and then build the model on that dataset (using appropriate validation strategies). That might be fairly easy if it is a song corpus that you already know and can quickly judge as positive or negative (or whatever other sentiment dimension you are going to use).
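
As a rough illustration of the supervised route, the sketch below trains a small multinomial Naive Bayes classifier with add-one smoothing in plain Python. Naive Bayes is only one possible learner, chosen here for brevity. The four labeled lyrics are invented, the function names are hypothetical, and no validation is performed, so treat it as a toy and not as a recommended setup.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split on anything that is not a letter."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

def train(labeled_songs):
    """Collect per-label word counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    doc_counts = Counter()              # label -> number of songs
    for text, label in labeled_songs:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab

def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_logp = None, -math.inf
    for label in doc_counts:
        logp = math.log(doc_counts[label] / total_docs)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are skipped
                logp += math.log((word_counts[label][tok] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

# Hypothetical hand-labeled lyrics, standing in for a labeled song corpus.
training = [
    ("sunshine love happy dancing", "positive"),
    ("love joy smile forever", "positive"),
    ("tears lonely broken heart", "negative"),
    ("pain crying lonely night", "negative"),
]
model = train(training)
```

Calling `predict(model, "happy love smile")` picks the positive label for this toy data. With 250 real songs, part of the labeled set should be held out (or cross-validation used) to check how well the model generalizes.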

    User: "mv070"
    New Altair Community Member
    OP
    Thanks, guys, both ways worked perfectly for me. My teacher wanted me to try more than one approach, so thanks a lot.

    User: "sgenzer"
    Altair Employee