Questions tagged [data-manipulation]

1 vote
2 replies
How is it possible to extract sub-strings using keywords and index?
I'm trying to get a certain sub-string following a keyword from a data string. These collected sub-strings are then joined together. Is there a...
asked 4 months ago
2 votes
1 reply
How to use mutate() to generate variables that depend on previous row values of other new variables?
I am trying to use dplyr's mutate() function to create new variables that depend on the previous row values of succeeding new variables. I've se...
asked 4 months ago
1 vote
1 reply
RegEx for matching numeric and decimals in dataframe
I have a column within a dataframe which has numbers followed by decimals, which I want to remove to make it more tidy and sortable. How would I...
asked 4 months ago
0 votes
0 replies
flagging strings that appear in one vector but not another (R) [duplicate]
This question already has an answer here: Test if a vector contains a given element...
asked 4 months ago
1 vote
1 reply
Manipulation of list of data frames in a for loop
I'm preparing data frames for analysis in R. I can prepare them separately correctly but I want to place the preparation in a for loop (or apply/...
asked 4 months ago
-1 votes
3 replies
Is there a way to get an average per day from my dataset in Python?
I have a dataset with datetime and temperature that I get using a query to my database. I don't know how to get the average for each day. I want...
asked 4 months ago
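One way the per-day average asked about above could be computed is to group readings by calendar date and take the mean of each group. This is a minimal sketch using only the standard library; the `rows` list is hypothetical sample data standing in for the query result, and the `(timestamp, temperature)` tuple layout is an assumption, since the question excerpt does not show the actual schema.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical sample rows standing in for the database query result.
# Assumed layout: (timestamp, temperature).
rows = [
    (datetime(2019, 1, 1, 8, 0), 10.0),
    (datetime(2019, 1, 1, 20, 0), 14.0),
    (datetime(2019, 1, 2, 9, 30), 7.5),
]


def daily_average(rows):
    """Group readings by calendar date and return {date: mean temperature}."""
    by_day = defaultdict(list)
    for stamp, temperature in rows:
        by_day[stamp.date()].append(temperature)
    # Sort by date so the result reads chronologically.
    return {day: mean(values) for day, values in sorted(by_day.items())}


averages = daily_average(rows)
for day, value in averages.items():
    print(day.isoformat(), value)
```

If the data already lives in a pandas DataFrame, grouping on the date part of the timestamp column would likely be the more idiomatic route; the plain-Python version is shown here only because it has no dependencies.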
-1 votes
2 replies
How to recode values of a variable based on the maximum value in the variable, for hundreds of variables?
I want to recode the max value of a variable as 1 and 0 when it is not. For each variable, there may be multiple observations with the max value....
asked 4 months ago
0 votes
0 replies
Possible to create crosstab of single column in pyspark?
I want to create a table that shows the cross tabulations of users belonging to each combination of segments in PySpark. Below is a reprodu...
1 vote
2 replies
Printing a list of dictionaries as a table
How can I format the below data into tabular form using Python? Is there any way to print/write the data as per the expected format? [{"itemco...
asked 4 months ago
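A possible approach to the table-printing question above is to compute a width for each column and pad every cell to that width. The sketch below is not the asker's data or expected format (both are cut off in the excerpt); the `records` list and its `name` and `qty` keys are hypothetical, and `format_table` is an illustrative helper, not a library function.

```python
def format_table(records):
    """Render a list of dicts as an aligned plain-text table."""
    if not records:
        return ""
    # Column order follows the first appearance of each key across all records.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    # Each column is as wide as its longest cell, header included.
    widths = {
        col: max(len(str(col)), *(len(str(r.get(col, ""))) for r in records))
        for col in columns
    }

    def line(cells):
        padded = (str(cell).ljust(widths[col]) for col, cell in zip(columns, cells))
        return " | ".join(padded).rstrip()

    header = line(columns)
    rule = "-+-".join("-" * widths[col] for col in columns)
    body = [line([r.get(col, "") for col in columns]) for r in records]
    return "\n".join([header, rule, *body])


# Hypothetical records; keys missing from a record are rendered as empty cells.
records = [{"name": "apple", "qty": 3}, {"name": "kiwi", "qty": 12}]
table = format_table(records)
print(table)
```

Returning the table as a string rather than printing inside the function keeps the same helper usable for writing to a file.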
-3 votes
0 replies
I need to manipulate android sensor data
I am using a pedometer app which uses the Google Fit API. Can anybody tell me how I can input wrong sensor data to the app so it counts fake steps?...
1 vote
2 replies
Ignore columns containing zeroes in each row and create a new object
I have a list object as follows: V1=c(5,5,5,5,5,5,5,5) V2=c(0,10,0,10,0,10,0,10) V3=c(0,0,15,15,0,0,15,15) V4=c(0,0,0,0,20,20,20,20) V5=c(25,25,...
asked 4 months ago
0 votes
2 replies
I am trying to assign a Holiday classifier to a list of dates
I have two dataframes, one with a list of dates and their corresponding holiday (df2), and another one with a list of transactions (df1). I'm try...
0 votes
1 reply
Reshape data set from wide to long format grouped by variable suffix
Similar yet different to this post: Reshaping data.frame from wide to long format I have a wide dataset with a unique ID variable and all other v...
0 votes
2 replies
Multiply each element of a column by each element of a different dataframe
I have two data frames, both with the same number of columns, but the first data frame has multiple rows and the second one has only one row but same...
asked 4 months ago
0 votes
1 reply
How to reorder values in a row alphabetically using T-SQL?
I need to reorder the values in rows of a table by alphabetical order, for example: Id Values -------------------------------- 1 Bana...
asked 4 months ago
-1 votes
0 replies
R - Count occurrences of a value between pairs of other values in a vector
I have a dataframe like below: col1 001 x x 002 001 002 x 003 004 x x 003 x 004 x x 005 005 x I would like to add the second...
asked 5 months ago
0 votes
2 replies
get only pure non numeric elements from column pandas
I have a data column like this: Phrase A4678LM AFNH 2l6m8 2312435 122 ABC HOW IS Pa805 and so on. Now this is a...
0 votes
1 reply
Need guidance with creating Django based dashboard
I'm a beginner at Django, and as a practice project I would like to create a webpage with a dashboard to track investments in a particular p2p pl...
-2 votes
1 reply
Google Apps Script: Sheets Forms Data Manipulation, Deleting Rows if Certain Cells are Blank, while Maintaining Certain Columns
This question is a continuation of the following: Google Apps Script: Sheets Forms Data Manipulation and Deleting Rows if Certain Cells are Blank...
0 votes
4 replies
How to remove dictionary items in list based on values in string
I'm busy extracting data with Python 2.7. So far I have a list with dictionaries as items. For 2 days I have not been able to get any further with this. Data:...
0 votes
1 reply
Azure search - Import data which is in .md file
I'm trying to upload data using Postman and getting an error: "The request entity's media type 'text/plain' is not supported for this resource."...
0 votes
1 reply
Parallelize for loop in R
I am trying to learn how to use parallel processing in R. A snapshot of the data and the code is provided below. Creating a rough dataset libra...
1 vote
2 replies
various transformations with lapply() - R
I have this df: df <- structure(list(Created = structure(6:1, .Label = c("2018-12-27T08:53:32.794-0300", "2018-12-27T17:46:00.244-0300", "20...
asked 5 months ago
3 votes
2 replies
convert multiple timezones into one - r
I have this dataframe: df <- data.frame(datetime = c("2018-08-23 11:03:25 0300", "2018-08-17 12:54:09 0300", "2018-08-07 17:15:29 0400", "201...
1 vote
1 reply
R dopar foreach on chunks instead of per line
This question is specific to using parallel processing in R using foreach and dopar. I have created a simple dataset and a simple operation (the...
asked 6 months ago