Sort and count word pairs in a string

Joined
Dec 10, 2022
Messages
9
Reaction score
2
Dear all,

I need a little help counting how often word pairs co-occur per line in a text, that is, how many times a word appears on the same line as other words. The list of co-occurring word pairs is dynamic and is entered in a textarea. Below is an example of such a list, followed by the result I would like to get.

Code:
<textarea id=t1 name="text1" cols=60 rows=10>
cloud hosting
escape characters
escape characters
faster databases
for mission
for premium
hiring blogdocsget
hosting app
hosting cloud
lives upport
maintenance spaces
mission critical
need live
need live
need live
need live
need live
storage volumes
storage volumes
storage volumes
</textarea><br><br>

Code:
<textarea id=t1 name="result" cols=60 rows=10>
need live (5)
storage volumes (3)
cloud hosting (1)
escape characters (2)
faster databases (1)
for mission (1)
for premium (1)
hiring blogdocsget (1)
hosting app (1)
lives support (1)
maintenance spaces (1)
mission critical (1)

</textarea><br><br>

Thanks in advance!

BobKuspe
 
Joined
Jan 30, 2023
Messages
108
Reaction score
13
You can solve this problem by reading the text line by line and counting the frequency of each pair of words in the list of co-occurrences. Here is one way to solve it in Python:

Python:
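def co_occurrence(text, word_list):
    # Start every listed pair at zero so pairs that never occur are still reported.
    word_count = {pair: 0 for pair in word_list}

    for line in text.splitlines():
        # Each line holds one pair; collapse stray whitespace before the lookup.
        pair = ' '.join(line.split())
        # The earlier version indexed word_count[word1 + ' ' + word2], a key that
        # was never created, so it raised KeyError. Count the line's own pair instead.
        if pair in word_count:
            word_count[pair] += 1
    return word_count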

text = '''cloud hosting
escape characters
escape characters
faster databases
for mission
for premium
hiring blogdocsget
hosting app
hosting cloud
lives upport
maintenance spaces
mission critical
need live
need live
need live
need live
need live
storage volumes
storage volumes
storage volumes
'''

word_list = ['cloud hosting', 'escape characters', 'faster databases', 'for mission', 'for premium',
             'hiring blogdocsget', 'hosting app', 'lives upport', 'maintenance spaces', 'mission critical',
             'need live', 'storage volumes']

result = co_occurrence(text, word_list)
for word, count in result.items():
    print(f'{word} ({count})')
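
If the list of pairs is not known in advance, the counting can be done straight from the text. The following is only a minimal sketch, assuming each non-empty line is one pair and that the most frequent pairs should come first, as the expected result in the question seems to suggest. The name `count_pairs` and the sample text are made up for illustration.

```python
from collections import Counter

def count_pairs(text):
    """Count identical non-empty lines, most frequent first."""
    # Collapse runs of whitespace so 'need  live' and 'need live' match.
    lines = (' '.join(line.split()) for line in text.splitlines())
    counts = Counter(line for line in lines if line)
    # Sort by descending count, then alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

sample = "need live\nstorage volumes\nneed live\ncloud hosting\n"
for pair, n in count_pairs(sample):
    print(f"{pair} ({n})")
```

Ties are broken alphabetically here; that ordering is an assumption, since the question does not say how pairs with equal counts should be ordered.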
 
Joined
Dec 10, 2022
Messages
9
Reaction score
2
Thank you
 
Joined
Jul 3, 2022
Messages
93
Reaction score
22
HTML:
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <title></title>
    <style>
    
    </style>
  
  </head>
  <body>   
<textarea id="t1" name="text1" cols="60" rows="10"></textarea>
<br />
<br />
<button>Count</button>
<script>
const tarea = document.querySelector('#t1');

document.querySelector('button').addEventListener('click', function(){
// Trim each line and drop blank ones so stray whitespace does not create separate entries.
const val = tarea.value.split('\n').map( x => x.trim() ).filter( x => x.length ).sort(),
      o = {};
if(!val.length){alert('No words to count'); return;}
val.forEach( x => o[x] ? o[x]++ : o[x] = 1 );
tarea.value = Object.entries(o).map( a => `${a[0]} (${a[1]})` ).join('\n');
});
</script>
  </body>
</html>

in => cloud hosting escape characters escape characters faster databases for mission for premium hiring blogdocsget hosting app hosting cloud lives upport maintenance spaces mission critical need live need live need live need live need live storage volumes storage volumes storage volumes

out => cloud hosting (1) escape characters (2) faster databases (1) for mission (1) for premium (1) hiring blogdocsget (1) hosting app (1) hosting cloud (1) lives upport (1) maintenance spaces (1) mission critical (1) need live (5) storage volumes (3)
 
Joined
Jul 12, 2020
Messages
89
Reaction score
9
Bob, here's an easier-to-understand version:
Code:
<body>
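<script type="text/javascript">
var contents = '';
// Toggle between the counted summary and the original textarea contents.
function checkTextareaEntries(){
   // Look the textarea up by its id; its name attribute is not a global variable.
   var box = document.getElementById('t1');
   var Output = '';
   if(contents==''){
      contents = box.value;
      // Textarea values use '\n' line breaks; accept '\r\n' too and skip blank lines.
      var arr = box.value.split(/\r?\n/).filter(function(s){ return s.trim() != ''; }).sort();
      for(var i=0;i<arr.length;i++){
         // arr is sorted, so equal entries are adjacent: report each entry once.
         if(i==0 || arr[i]!=arr[i-1]){
            var count = ' (' +countTextareaEntries(arr,arr[i])+ ')';
            Output += arr[i]+count+'\n';
         }
      }
      Output += '\nTotalEntries: ' + arr.length;
   }
   else{ Output=contents; contents=''; }
   box.value = Output;
}

// Count how many entries in arr are equal to ent.
function countTextareaEntries(arr,ent){
   var count = 0;
   for(var i=0;i<arr.length;i++){
      if(arr[i]==ent){count++;}
   }
   return count;
}
</script>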

<textarea id=t1 name="text1" cols=60 rows=10></textarea>
<br><br>
<button onclick="checkTextareaEntries()">Check Me</button>
</body>
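
The answers above count whole lines. For the broader reading of the original question (how often two words share a line, whatever their order), something like the following might serve as a starting point. It is a sketch under assumptions: words are split on whitespace, case is ignored, and "hosting cloud" is treated as the same pair as "cloud hosting". The function name `word_cooccurrence` is hypothetical.

```python
from collections import Counter
from itertools import combinations

def word_cooccurrence(text):
    """Count unordered word pairs that appear on the same line."""
    counts = Counter()
    for line in text.splitlines():
        # Unique, sorted words so ('cloud', 'hosting') and ('hosting', 'cloud') merge.
        words = sorted(set(line.lower().split()))
        counts.update(combinations(words, 2))
    return counts

sample = "cloud hosting\nhosting cloud\nhosting app\n"
for (a, b), n in word_cooccurrence(sample).most_common():
    print(f"{a} {b} ({n})")
```

Because each pair is stored in sorted order, a line with three or more words contributes every two-word combination it contains, each counted once per line.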
 
